ABSTRACT


Drug addiction in the United States generates significant health, economic, and social
costs. One of the prominent ways in which traffickers smuggle drugs into the United
States is by maritime shipments from South America. In 1989, the Joint Interagency Task
Force South (JIATF-S) was established to fight these traffickers. JIATF-S collects
information from multiple sources, which can be broadly classified into two categories.
The first category is sensor-based sources that produce observations about possible
targets (e.g., radar, sonar). These observations provide precise location and time but are
susceptible to false positive and false negative errors regarding their content. The second
category is human-based sources, including tips, messages and intercepted
communications among humans. In addition to possible misinformation regarding the
content of an event, such inputs are also susceptible to errors regarding the location and
time of the event.


In this thesis we develop a data fusion model that can assist JIATF-S in estimating
the likelihood that a certain target (i.e., drug-smuggling vessel) is present at a certain
location at a certain time and evaluate the reliability of the information source.


The novelty of this thesis is manifested in a new probabilistic approach for
utilizing human-generated intelligence, and in the way it is combined with
sensor-generated intelligence.
EXECUTIVE SUMMARY


Data fusion from various sources is a common problem for intelligence organizations 
around the world. In this thesis we explore the efforts of the Joint Interagency Task Force 
South, an organization established in 1989 to fight drug traffickers originating from 
South America, to combine different sources of intelligence into a coherent picture to
seize the smuggled drugs.


In this thesis we examine the combination of two categories of intelligence
sources regarding drug smugglers: (1) sensor-based sources such as SIGINT (signal
intelligence) and VISINT (visual intelligence), and (2) human-based sources such as
HUMINT (human intelligence) and COMINT (communication intelligence). Sensor-based
sources typically have high precision regarding location and time of an observation but
are susceptible to false positive and false negative errors. Human-based sources,
including tips, messages and communications generated by humans, are susceptible to
these same errors. In addition to possible misinformation regarding the description of a
reported event, these sources also tend to have low precision regarding the location and
time of the event.


We explore several methods for combining information from sensor-based and
human-based sources. In addition to the traditional Bayesian update mechanism, which is
commonly used for sensor fusion, we also examine applying Dempster-Shafer theory.
Bayes’ method is mathematically rigorous but requires assumptions that the
Dempster-Shafer methods do not need, namely that the distribution of the
messages received from the informant is known and uniform. The Dempster-Shafer
theory does not make those assumptions explicitly. Moreover, there are several ways to
implement the Dempster-Shafer theory, and it is not clear in advance which
implementation would be most appropriate for a given scenario. We compare the
methods both qualitatively and quantitatively using a simulation.


Our analysis shows that even when the assumptions of the Bayes’ update process
are violated, it still manages to yield the best results. The Dempster-Shafer methods did
not perform better than Bayes even though they do not explicitly make as many
assumptions as the Bayes update. As expected, when the reliability of the informant is
low or is mistaken to be low, and there is non-uniformity in the way the informant produces
messages, all the methods performed poorly.


In addition, we develop a Bayesian model to assess the quality of the informant
and update a vessel’s location simultaneously. We formulate update procedures both
for the case in which the informant’s messages can be verified and for the case in which
they cannot, where we must rely only on the current perception about the location of the
vessel and the informant’s reliability for the update. We suggest a combined scheme that allows
simultaneous estimation of both the location of the vessel and the reliability of the
informant as new information becomes available.
I. INTRODUCTION


A. MOTIVATION 


Drug addiction in the United States generates significant health, economic, and 
social costs. According to the National Drug Intelligence Center (NDIC) report from 
2011, “In 2007 alone, the estimated cost of illicit drug use to society was $193 billion, 
including direct and indirect public costs related to crime, health, and productivity... an 
increasing number of individuals, particularly young adults, are abusing illicit drugs. In 
2009, an estimated 8.7 percent of Americans aged 12 or older (21.8 million individuals)
were current illicit drug users” (NDIC, 2011).


One of the primary ways drugs arrive in the United States is via smuggled
maritime shipments from South America. Most of the marijuana seized and more than 
50% of the methamphetamine, cocaine and heroin seized are detected on the Southwest 
border (NDIC, 2011). Figure 1 illustrates that more than 99% of the cocaine flow from 
South America to the United States in 2007 was smuggled through the Caribbean Sea or 
Pacific Ocean via Mexico (Palter, 2009). 





Figure 1. Northward-bound Cocaine Flows (From Palter, 2009).


In October 1989 the Joint Interagency Task Force South (JIATF-S) was 
established in order to fight these traffickers. JIATF-S is a multiservice, multiagency 
national task force that conducts counter-illicit trafficking operations, intelligence fusion, 
and multi-sensor correlation to detect, monitor, and hand off suspected illicit trafficking 
targets (JIATF-S, 2013). JIATF-S collects information from multiple sources with 
different characteristics and of different quality. The types of information collected can
be broadly classified into two categories:


1) Sensor-based intelligence: Sensor observations are typically characterized
by high precision regarding location and time of the observation but are also susceptible
to false positive and false negative errors regarding their outcome. Typical examples of
sensor-based sources are electronic intelligence (ELINT), obtained from sources such as
radar and non-content signals from communication devices, and visual intelligence
(VISINT), obtained from video, images and the naked eye.


2) Human-based intelligence: Tips, messages and communications generated
by humans, which in addition to errors in content are also susceptible to low precision
regarding the location and time of the event. The error rate depends upon the reliability of
the source, which is less well defined than the error rates of sensors. Typical examples are
human intelligence (HUMINT), i.e., intelligence such as tips and messages
gathered from human sources, and communications intelligence (COMINT), i.e.,
content-based intelligence gathered from intercepted communications.


A main challenge for JIATF-S is the integration of information about different
spatial locations and time ranges from multiple sources in a consistent and coherent
manner in order to locate illicit drug-trafficking vessels.


In this thesis we develop data fusion techniques to assist JIATF-S in estimating
the likelihood that a certain target (i.e., a drug-smuggling vessel) is present at a certain
location at a certain time. This information can provide JIATF-S with better situational
awareness and inform decision makers about where to send search and
interdiction assets. More specifically, we provide a probability distribution for the
location and departure time of possible targets. We also evaluate the quality of the
information sources (e.g., the reliability of a human informant) in order to give their
inputs the proper weight.


B. CONTRIBUTIONS OF THIS WORK 


Exploitation of human intelligence for targeting has been known since ancient
times and discussed by many strategists and military theorists, as described for example
in the famous “The Art of War” written by Sun Tzu in the 6th century BC. Processing
HUMINT into spatial information, for instance to create crime maps for
policing applications, is also known (Ratcliffe, 2000).


There is also a vast literature on the fusion of sensor-based information (Hall &
Llinas, 2001); however, there is still a need to combine human intelligence with
intelligence from other sources, a process that is traditionally done manually by trained
professionals due to the difficulties of implementing automated algorithms.


In this work we suggest automated algorithms to combine human-based
intelligence and sensor-based intelligence in a single framework. We also provide a way
to estimate and update the perceived reliability of our sources.


C. THE INTELLIGENCE PROCESS 


Since this thesis relates to the effective utilization of intelligence, it is useful to 
frame this thesis within the intelligence processing paradigm. The intelligence process 
comprises the following six categories of intelligence operations (United States Dept. of 
the Army, 2007): 


• Planning and direction: Planning operations to acquire new or better data or
develop intelligence sources.


• Collection: Acquisition of the required data.


• Processing and exploitation: Converting the collected raw data into
information that can be used by commanders.


• Analysis and production: Analyzing the information and producing
higher-level intelligence from the information gathered via interpretation and
integration with other relevant information.


• Dissemination and integration: Disseminating the intelligence to appropriate
users.


• Evaluation and feedback: Evaluation of the intelligence performance.


The following figure represents graphically the intelligence process: 


Figure 2. The Intelligence Process (From United States Dept. of the Army, 2007).


Since the evaluation and feedback component is used to initialize the other 
intelligence activities, the intelligence process is also referred to as the “intelligence 
cycle.” In this thesis, we concentrate on the processing stage and examine how to 
effectively integrate new pieces of information into the intelligence profile using data
fusion methods.


Combining information from different data sources is commonly called data 
fusion, which is defined as a “process dealing with the association, correlation, and 
combination of data and information from single and multiple sources to achieve refined 
position and identity estimates, and complete and timely assessments of situations and
threats as well as their significance” (White, 1991).


In this thesis, we concentrate on combining the two types of inputs, sensor data
and HUMINT input, into a single fused intelligence picture. The feedback between these
two sources can be used, in turn, to reevaluate the quality of the intelligence obtained
from each source, thus improving the fusion of future intelligence. The following figure
illustrates, in simple terms, the processing problem considered in this thesis.












Figure 3. Simplified intelligence process model. [Diagram: sensor-based intelligence and
human-based intelligence feed into data fusion and evaluation.]


D. DATA FUSION 


Data fusion may relate to fusion of information of different levels and for 
different purposes. In order to encompass and categorize the different levels, the Data 
Fusion Subpanel (which later became known as the Data Fusion Group) of the Joint 
Directors of Laboratories (JDL) developed the Data Fusion model in 1985 (Hall & 
Llinas, 2001). 


The JDL model as revised in (Steinberg, Bowman & White, 1999) categorizes the 
fusion according to the relation of the information to the entity of interest (in our case the 
drug smuggling vessel) and the purpose of the outcome of the fusion. The following 
levels are included in the model: 


• Level 0 (Sub-Object Data Assessment): prediction of entities that are not
recognized as an object yet, such as pixels and radio signals.


• Level 1 (Object Assessment): estimation and prediction of entity states on the
basis of inferences from observations.


• Level 2 (Situation Assessment): estimation and prediction of entity states on the
basis of inferred relations among entities.


• Level 3 (Impact Assessment): estimation and prediction of effects on situations
of planned or estimated/predicted actions by the participants.


• Level 4 (Process Refinement, an element of Resource Management): adaptive
data acquisition and processing to support mission objectives.


The flow of information, from raw measurements to assessment of the entire
picture, with regard to the JDL model levels is described in the following figure:


Figure 4. JDL model information flow (From Steinberg, Bowman & White 1999).


In this thesis, we explore levels 0 and 1 of the JDL model, combining raw data
from multiple sources in order to estimate a target’s location and departure
time. Higher data fusion levels, such as assessing the characteristics and intentions of
multiple targets, are not discussed in this work.


We will examine two information-fusion approaches in this thesis: Bayesian
update and Dempster-Shafer Theory.


a. Bayesian Update 


Bayes’ formula was first introduced in the 18th century (Bayes, 1763), and
its applications are found in many fields that range from diagnosing the medical condition
of patients (Lincoln & Parker, 1967) to artificial intelligence (Korb & Nicholson, 2011).


Many tracking and location algorithms are based on Bayesian methods,
such as the well-known Kalman filter. The manuscript “Bayesian Filtering: From Kalman
Filters to Particle Filters, and Beyond” (Chen, 2003) includes an exhaustive review of
filters, and Morelande, Kreucher & Kastella (2007) review other Bayesian tracking
algorithms.


Bayesian methods are also applied extensively to sensor fusion. The book
Handbook of Multisensor Data Fusion (Hall & Llinas, 2001) presents an overview of
data fusion methods that includes several chapters regarding Bayesian updates and treats
the Bayesian update as the basic method for data fusion.


In this work, the Bayesian method is used in a similar manner to update
the probability that a target is at a particular location at a certain time.


However, Bayesian methods also have limitations. In particular, they 
require intimate knowledge of sensor capabilities, such as estimates of the error rates, a 
notion of the distribution of the possible errors of the sensor and assumptions regarding 
the state of the world, such as a prior distribution. For those reasons, we also consider
other information updating methods.


b. Dempster-Shafer Belief Method 


Dempster-Shafer theory (DST) was developed by Arthur Dempster and 
Glenn Shafer (Dempster, 1967; Shafer, 1976). This theory, which is in some sense a 
generalization of probability theory, allows for assigning “belief” values (and not
probabilities) to events and sets of events, thus requiring fewer assumptions and axioms.


Due to the theory’s ability to deal with complicated types of variability in
belief, Dempster-Shafer theory has been used widely for decision making algorithms and
data fusion. An Introduction to Bayesian and Dempster-Shafer Data Fusion (Koks and
Challa, 2003) is a good introductory summary of Dempster-Shafer theory in comparison
with Bayesian methods.


Hall and Llinas (2001) also include several chapters about Dempster-Shafer
theory, while Sentz (2002) is an extensive report about different Dempster-Shafer
methods and their applications.


E. OTHER POSSIBLE APPLICATIONS 


The framework suggested in this thesis for locating drug traffickers may be
applied to a range of related applications.


1) Locating friendly forces: A similar problem to the one described above on
the sea can occur on land as well, when data from sensors such as radars are combined
with information from human sources in order to locate a friendly force in need of
assistance on the battlefield.


2) Other types of sensors: Traditional updating mechanisms require an
intimate knowledge of the technical parameters that determine the performance of a
sensor and the environment in which it is used. When this knowledge is lacking, more
robust methods, such as the ones explored in this thesis, can be of use. Those methods
can be applied not only to SIGINT and HUMINT, but also to other types of intelligence.


F. THESIS OUTLINE 


This chapter includes the background and problem description. Chapter II 
describes the problem, the basic integration methods used, the assumptions, and the 
details of the Bayesian update and Dempster-Shafer belief theory and their application to 
the problem. In Chapter III we compare those two models and develop a simulation to 
gain additional insights. Chapters IV and V include a detailed mathematical framework
of extensions to the base model described in Chapter II. While Chapter IV includes a
model that handles multiple routes, Chapter V includes a framework that allows
estimating and updating the reliability of the sources.


II. THE MODEL


In this chapter we describe the scenario, the theater of operations and the goals of
the JIATF-S operator. We define the different types of intelligence received and describe
a model that updates the situational awareness regarding this scenario by combining
intelligence from different sources.


A. SCENARIO 


We consider a drug-smuggling situation from the northern part of South America
to Central America. Drug smugglers may leave from multiple points of embarkation in
the northern part of South America towards one of multiple final destinations in Central
America and Mexico. The smugglers use three kinds of vessels for their operations:


• GO-FAST small boats: designed to reach high velocities but with relatively
low capacities,


• Merchant vessels: high capacity but slow and easy to detect, and


• SPSS (self-propelled semi-submersible) vessels: partly submersible and therefore
difficult to detect by radar, but slow.


The types of vessels have very different characteristics, and therefore it is usually
easy to distinguish among them. There are also several categories of typical routes the
smugglers use:


• Close to the shore in the Pacific Ocean,

• Close to the shore in the Caribbean Sea,

• In the Pacific Ocean, via the Galapagos Islands,

• Straight routes between the embarkation point and the destination, and

• Piece-wise linear routes between the embarkation point and the destination.
(This category essentially covers all possibilities.)


Figure 5 shows examples of those routes. 





Figure 5. Typical smuggling routes. 


JIATF-S operators desire enhanced situational awareness about the location of 
targets in order to more effectively direct interdicting assets that can seize the smuggled 
drugs. JIATF-S operators increase situational awareness by obtaining information from 
sensors such as radars and cameras, as well as from human sources. Often, new 
intelligence arrives in real time. In those cases, the operators should update their 
perceived probability that a vessel is located in a specific place according to the new
intelligence.


JIATF-S operators use their situational awareness to send surveillance aircraft or surface
vessels to look for smugglers in the suspected location. If a smuggler is positively
identified, the surveillance vehicle holds contact with the smuggler’s vessel until a
maritime force boards it and confiscates the drugs.


The operators must also evaluate the quality of the different intelligence sources
in order to weight their information contributions appropriately.


B. ASSUMPTIONS 


For tractability, we initially make the following assumptions, some of which we
relax later. These assumptions reduce the problem to estimating a single parameter:
departure time.


1. Single Target Vessel


There is at most one target vessel in the theater at any given time. Although in
reality multiple vessels might be present in the theater at the same time, it is assumed that
JIATF-S’s operators are able to associate incoming information with the correct vessel.
In other words, in this thesis we do not consider data-association problems, which might be
a subject for future research.


2. Constant and Known Speed


The speed of the vessel is constant and known. This assumption is reasonable
because the speed of the vessel depends mainly on the vessel’s type (which is usually
known), and the variance of the speed within each vessel type is rather small, so the speed
is unlikely to change much during the voyage. Future work may consider variable velocity
due to weather conditions, refueling stops, strategic considerations or other factors.


3. Discrete Departure Time Distribution 


The set of departure times is discrete and finite; the vessel can leave the harbor at
any one of several possible time slots. As the discretization can be as fine as necessary,
this assumption does not affect the results of the model. A reasonable size of a time slot
would be one to three hours.


4. A Single Known Route 


For simplicity, we initially assume there is only one possible route. This
assumption is relaxed later on to include multiple routes. Since the route and the speed of
the vessel are known and fixed, the location of the vessel is uniquely defined by the time
of departure.


C. DEFINITIONS 


A random variable, $T_d$, denotes the departure time of the vessel.


For simplicity we assume that $T_d$ is discrete, and its possible values are in the set
$T = \{t_1, \ldots, t_n\}$, so there are $n$ possible departure times.


The probability mass function of the departure times is $f(t) = P(T_d = t)$. We
assume a prior for this distribution, and our objective is to update this prior as new
information arrives. As more intelligence arrives, the posterior should narrow around a
handful of most likely departure times to aid in the routing of surveillance aircraft.


1. Sensor-based Intelligence — Observations 


An observation is a random event associated with a certain time, $t' \in T$. An
example of an observation might be the event that the operator received a radar reading
regarding a certain time and location. Since we assume a single route and fixed velocity,
an observation that is made at any location along the route can be trivially translated to an
observation made about a perceived departure time. For instance, if the speed is 30 knots,
and we have an observation at distance 60 NM on the route at 4 p.m., it is equivalent to an
observed departure at 2 p.m. This allows us to locate the ship at any desired time after
departure.
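

The conversion between a sighting along the route and a perceived departure time can be sketched in a few lines of Python. This is only an illustration under the single-route, constant-speed assumptions; the function names and the calendar date are hypothetical and do not come from the thesis.

```python
from datetime import datetime, timedelta


def implied_departure(observed_at: datetime, distance_nm: float,
                      speed_knots: float) -> datetime:
    """Departure time implied by a sighting `distance_nm` along the route."""
    if speed_knots <= 0:
        raise ValueError("speed_knots must be positive")
    hours_under_way = distance_nm / speed_knots  # 1 knot = 1 NM per hour
    return observed_at - timedelta(hours=hours_under_way)


def position_at(departure: datetime, when: datetime,
                speed_knots: float) -> float:
    """Distance along the route (NM) at time `when` for a given departure."""
    hours = (when - departure).total_seconds() / 3600.0
    return max(0.0, hours * speed_knots)


# The example from the text: 60 NM along the route at 4 p.m., at 30 knots.
sighting = datetime(2013, 1, 1, 16, 0)  # the date itself is arbitrary
departure = implied_departure(sighting, distance_nm=60, speed_knots=30)
print(departure.strftime("%H:%M"))  # 14:00
```

The second helper runs the same relation forward, which is what allows the vessel to be located at any later time once a departure time is assumed.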


The formal definition of an observation is:


$O_{t',+}$ - a positive observation (a vessel departed at time $t'$).


$O_{t',-}$ - a negative observation (no departure at time $t'$).


The observations may be subject to the following errors:


$P(O_{t',+} \mid T_d \neq t') = P_{f+}$ - False positive error: the sensor reports a departure while
there is none.


$P(O_{t',-} \mid T_d = t') = P_{f-}$ - False negative error: the sensor fails to detect a departure.


The error probabilities depend on the detection and classification capabilities of
the sensor and on the characteristics of the environment. In particular, the probability of a
false positive error depends on the number of non-target vessels and debris in the area of
the point of embarkation.


2. Human-based Intelligence — Messages 


While an observation is associated with a specific time of departure, messages are
less specific and may include a range of possible departure times. An example of a
typical message may be an informant relaying, “I’ve heard that a ship might embark
between 8 a.m. and noon” or an intercepted telephone communication hinting that
“The drug dealers will leave on one of the following mornings . . .” As mentioned above
in Section B, we first consider only the time ambiguity and assume a single known
embarkation point. This assumption is relaxed later.


Let $M$ denote the random event that a certain message is received. The sample
space of those events (the possible messages) is all the subsets of the departure times
$\{t_1, \ldots, t_n\} = T$ except for the empty set. Let $k$ denote the cardinality of the message, that
is, $|M| = k$. Thus, if the event $M$ occurred, the informant claims that the departure time is
one of the $k$ values in $M$.
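

To make the message sample space concrete, the following Python sketch enumerates every possible message for a small set of departure times. The labels `t1` to `t4` and the function name are illustrative assumptions only.

```python
from itertools import combinations


def all_messages(times):
    """Every non-empty subset of the departure-time set, as frozensets."""
    return [frozenset(subset)
            for k in range(1, len(times) + 1)
            for subset in combinations(times, k)]


times = ["t1", "t2", "t3", "t4"]  # hypothetical departure-time labels
messages = all_messages(times)

# With n departure times there are 2**n - 1 possible messages.
print(len(messages))  # 15

# Number of messages of each cardinality k.
sizes = {k: sum(1 for m in messages if len(m) == k) for k in range(1, 5)}
print(sizes)  # {1: 4, 2: 6, 3: 4, 4: 1}
```

The count grows exponentially in the number of departure times, which is one reason a full specification of the probability of every message is impractical.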


D. BAYESIAN UPDATE 


1. The Update Process 
As before, $f(t)$ is the probability mass function of the true departure time, and
$P(T_d = t) = f(t)$ is the probability that the true departure time is $t$. According to Bayes’
formula, the updated probability given new information is defined as:


$$P_{new}(T_d = t \mid \text{New information}) = \frac{P(\text{New information} \mid T_d = t) \cdot P_{prior}(T_d = t)}{P(\text{New information})} \tag{2.1}$$


From Equation (2.1) it follows that in order to calculate the updated probability
distribution using Bayes’ method, one requires a prior probability $P_{prior}(T_d = t) = f_{prior}(t)$.
This probability reflects the prior information we have about the vessel’s departure time.
Absent any information, we use the uniform prior $f_{prior} \sim U[T]$ as the default.
However, if we have some general information about the likelihood of different departure
times (for instance, if we know that the likelihood of departing at low tide is much higher
than at high tide), we can integrate this knowledge by altering the prior.
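

As a small illustration of altering the prior, the Python sketch below builds either a uniform prior or one weighted by relative likelihoods. The time slots and the 3:1 weighting for a low-tide slot are invented for the example and are not taken from the thesis.

```python
def make_prior(times, weights=None):
    """Return a dict time -> probability; uniform when no weights are given."""
    if weights is None:
        weights = {t: 1.0 for t in times}
    total = sum(weights[t] for t in times)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return {t: weights[t] / total for t in times}


slots = ["00:00", "03:00", "06:00", "09:00"]  # hypothetical three-hour slots

uniform = make_prior(slots)
# Assumption for illustration: low tide falls in the 03:00 slot and a
# departure then is judged three times as likely as in any other slot.
tidal = make_prior(slots, {"00:00": 1, "03:00": 3, "06:00": 1, "09:00": 1})

print(uniform["03:00"], tidal["03:00"])  # 0.25 0.5
```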


Equation (2.1) can be rewritten as a product of an update function and the prior
information regarding the distribution, $f_{new}(t) = f_{update}(t) \cdot f_{prior}(t)$, with the update
function being:


$$f_{update}(t) = \frac{P(\text{New information} \mid T_d = t)}{P(\text{New information})} \tag{2.2}$$


In the following sections we show how to define the update function for different
intelligence types.


2. Bayesian Update Following an Observation


As described before, there are two types of observations, positive and negative. 


We shall calculate the update function for each of those cases. 
Applying Equation (2.2) to the case where the new information is a positive
observation at time $t'$, and using the law of total probability for the denominator, the
update function is:


$$f_{update}(t) = \frac{P(O_{t',+} \mid T_d = t)}{\sum_{s \in T} P(O_{t',+} \mid T_d = s) \cdot f(s)} \tag{2.3}$$


By definition of false positive and false negative errors, the probability of
receiving a positive observation regarding time $t'$ is $1 - P_{f-}$ if the true departure time is
indeed $t'$ and $P_{f+}$ otherwise:


$$P(O_{t',+} \mid T_d = t) = \begin{cases} 1 - P_{f-} & t = t' \\ P_{f+} & t \neq t' \end{cases} \tag{2.4}$$


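And so we can calculate the total update function, given that we know the values
of $P_{f+}$ and $P_{f-}$:


$$f_{update}(t) = \begin{cases} \dfrac{1 - P_{f-}}{(1 - P_{f-}) \cdot f(t') + P_{f+} \cdot (1 - f(t'))} & t = t' \\[2ex] \dfrac{P_{f+}}{(1 - P_{f-}) \cdot f(t') + P_{f+} \cdot (1 - f(t'))} & t \neq t' \end{cases} \tag{2.5}$$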


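Similarly, we can follow the entire calculation for the case of receiving a negative
observation, and we get the following update function:


$$f_{update}(t) = \begin{cases} \dfrac{P_{f-}}{P_{f-} \cdot f(t') + (1 - P_{f+}) \cdot (1 - f(t'))} & t = t' \\[2ex] \dfrac{1 - P_{f+}}{P_{f-} \cdot f(t') + (1 - P_{f+}) \cdot (1 - f(t'))} & t \neq t' \end{cases} \tag{2.6}$$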
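

The observation update can be sketched in Python as follows. The sketch multiplies the prior by the likelihood of the observation and normalizes, which is equivalent to applying the update function to the prior; the function name, the dictionary representation of the distribution, and the error rates of 0.1 and 0.2 are assumptions made only for illustration.

```python
def update_after_observation(prior, t_obs, positive, p_fp, p_fn):
    """Posterior over departure times after a sensor observation about t_obs.

    prior    : dict mapping departure time -> probability (sums to 1)
    positive : True for a positive observation, False for a negative one
    p_fp     : false positive probability (P_f+ in the text)
    p_fn     : false negative probability (P_f- in the text)
    """
    def likelihood(t):
        if positive:
            return 1.0 - p_fn if t == t_obs else p_fp
        return p_fn if t == t_obs else 1.0 - p_fp

    # Law of total probability: probability of seeing this observation at all.
    evidence = sum(likelihood(s) * prior[s] for s in prior)
    if evidence == 0:
        raise ValueError("observation has zero probability under the prior")
    return {t: likelihood(t) * prior[t] / evidence for t in prior}


prior = {t: 0.25 for t in ("t1", "t2", "t3", "t4")}
# Illustrative error rates, not values from the thesis.
post = update_after_observation(prior, "t2", positive=True, p_fp=0.1, p_fn=0.2)
print({t: round(p, 3) for t, p in post.items()})
```

Because the function returns a normalized distribution, its output can be fed back in as the prior for the next observation, which mirrors the sequential use of the update described in the text.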
3. Bayesian Update Following a Message 


For a given informant and a message of size $k$ we define $q_k$ to be the probability
that the message is correct: $q_k = P(T_d \in M \mid |M| = k)$. That is, we assume that the quality
of the informant depends only on the size of the message. The exact mathematical
definition of this parameter will be discussed in the following chapters.

We assume that $q_k$ is monotone non-decreasing in $k$ and that $q_n = 1$, where the
informant gives the entire possible set of departure times as the message. An additional
assumption here is that $q_k$ does not depend on the content of the message but only on its
size $k$.


The update of the probability distribution of the random variable $T_d$ following a
new message $M$ is done in a similar way to the observation case:


$$f_{update}(t) = \frac{P(M \mid T_d = t)}{\sum_{s \in T} P(M \mid T_d = s) \cdot f(s)} \tag{2.7}$$


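Differentiating between the case when “$t$ is in the message” and “$t$ is not in the message”
and using the law of total probability, we obtain that the update function is:


$$f_{update}(t) = \begin{cases} \dfrac{P(M \mid T_d = t, t \in M) \cdot q_k}{\sum_{s \in M} P(M \mid T_d = s, s \in M) \cdot q_k \cdot f(s) + \sum_{s \notin M} P(M \mid T_d = s, s \notin M) \cdot (1 - q_k) \cdot f(s)} & t \in M \\[3ex] \dfrac{P(M \mid T_d = t, t \notin M) \cdot (1 - q_k)}{\sum_{s \in M} P(M \mid T_d = s, s \in M) \cdot q_k \cdot f(s) + \sum_{s \notin M} P(M \mid T_d = s, s \notin M) \cdot (1 - q_k) \cdot f(s)} & t \notin M \end{cases} \tag{2.8}$$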





However, in order to calculate the value of $f_{update}(t)$ we must know the values of
the expressions $P(M \mid T_d = t, t \in M)$ and $P(M \mid T_d = t, t \notin M)$. In simple words, we need
to know the probability of receiving every possible message given every possible
departure time. Since this information is practically impossible to acquire, some
additional assumptions must be made.

We assume that the probability of receiving a message of length $k$ is $l_k$, and that all
messages of a certain size that include the true departure time are equally likely. In other
words, if for example the possible departure times are $T = \{t_1, t_2, t_3, t_4\}$ and the real
departure time is $t_1$, then receiving $\{t_1, t_2\}$ is as likely as receiving $\{t_1, t_3\}$ or $\{t_1, t_4\}$. In
general, since the number of messages of size $k$ that include a certain departure time $s$ is
$\binom{n-1}{k-1}$, the conditional probability of receiving a certain message that includes the
true departure time $t$ is:


$$P(M \mid T_d = t, t \in M, |M| = k) = l_k \cdot \frac{1}{\binom{n-1}{k-1}} \tag{2.9}$$
Similarly, there are $\binom{n-1}{k}$ possible messages of size $k$ that do not include a
certain departure time $t$, and assuming they are all equally likely brings us to the
following expression for the probability of receiving a certain message given that it does
not include the true departure time $t$:


$$P(M \mid T_d = t, t \notin M, |M| = k) = l_k \cdot \frac{1}{\binom{n-1}{k}} \tag{2.10}$$





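Combining Equations (2.9) and (2.10) we have the probability of receiving the
message $M$ of size $k$:


$$P(M \mid T_d = t, |M| = k) = \begin{cases} \dfrac{l_k \cdot q_k}{\binom{n-1}{k-1}} & t \in M \\[2ex] \dfrac{l_k \cdot (1 - q_k)}{\binom{n-1}{k}} & t \notin M \end{cases} \tag{2.11}$$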





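Thus, the denominator of the update function (2.7) becomes


$$P(M) = \sum_{s \in T} f(s) \cdot P(M \mid T_d = s, |M| = k) = \sum_{s \in M} f(s) \cdot \frac{l_k \cdot q_k}{\binom{n-1}{k-1}} + \sum_{s \notin M} f(s) \cdot \frac{l_k \cdot (1 - q_k)}{\binom{n-1}{k}} \tag{2.12}$$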


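Combining the numerator (2.11) and the denominator (2.12) of the update
function, we have the following function:


$$f_{update}(t) = \begin{cases} \dfrac{\dfrac{l_k \cdot q_k}{\binom{n-1}{k-1}}}{\sum_{s \in M} f(s) \cdot \dfrac{l_k \cdot q_k}{\binom{n-1}{k-1}} + \sum_{s \notin M} f(s) \cdot \dfrac{l_k \cdot (1 - q_k)}{\binom{n-1}{k}}} & t \in M \\[5ex] \dfrac{\dfrac{l_k \cdot (1 - q_k)}{\binom{n-1}{k}}}{\sum_{s \in M} f(s) \cdot \dfrac{l_k \cdot q_k}{\binom{n-1}{k-1}} + \sum_{s \notin M} f(s) \cdot \dfrac{l_k \cdot (1 - q_k)}{\binom{n-1}{k}}} & t \notin M \end{cases} \tag{2.13}$$


Simplifying this expression, using $\binom{n-1}{k-1} = \frac{k}{n}\binom{n}{k}$ and $\binom{n-1}{k} = \frac{n-k}{n}\binom{n}{k}$, yields the following update function:


$$f_{update}(t) = \begin{cases} \dfrac{\dfrac{q_k}{k}}{\sum_{s \in M} f(s) \cdot \dfrac{q_k}{k} + \sum_{s \notin M} f(s) \cdot \dfrac{1 - q_k}{n - k}} & t \in M \\[5ex] \dfrac{\dfrac{1 - q_k}{n - k}}{\sum_{s \in M} f(s) \cdot \dfrac{q_k}{k} + \sum_{s \notin M} f(s) \cdot \dfrac{1 - q_k}{n - k}} & t \notin M \end{cases} \tag{2.14}$$


As an example, if the prior is a uniform prior function, $f_{prior}(t) = \frac{1}{n}$, the updated
distribution can be simplified to be:


$$f_{new}(t) = \begin{cases} \dfrac{\dfrac{1}{n} \cdot \dfrac{q_k}{k}}{\dfrac{k}{n} \cdot \dfrac{q_k}{k} + \dfrac{n-k}{n} \cdot \dfrac{1 - q_k}{n - k}} = \dfrac{q_k}{k} & t \in M \\[5ex] \dfrac{\dfrac{1}{n} \cdot \dfrac{1 - q_k}{n - k}}{\dfrac{k}{n} \cdot \dfrac{q_k}{k} + \dfrac{n-k}{n} \cdot \dfrac{1 - q_k}{n - k}} = \dfrac{1 - q_k}{n - k} & t \notin M \end{cases} \tag{2.15}$$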


Thus, we have shown how to update the probability distribution after receiving
an observation or a message. 

4. Example 

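The following example demonstrates the process described above. Suppose there are only four possible departure times, $T = \{t_1, t_2, t_3, t_4\}$, of which the true departure time is $t_1$.

We receive a message of size $k = 2$, $M = \{t_1, t_2\}$, and $q_2 = 0.9$. Also, let us assume that the prior probability distribution of the departure times is uniform:

$$f_{pre}(t_1) = f_{pre}(t_2) = f_{pre}(t_3) = f_{pre}(t_4) = 0.25$$

There are $\binom{n-1}{k-1} = \binom{3}{1} = 3$ possible messages of size 2 that include the true departure time, and the probability of receiving one of them is equal to 0.9. Similarly, there are $\binom{n-1}{k} = \binom{3}{2} = 3$ possible messages of size 2 that do not include the true departure time, and the probability of receiving one of them is equal to 0.1.

In this case, the updated function is:

$$\begin{aligned} f_{post}(t_1) = f_{post}(t_2) &= \frac{\frac{1}{3} \cdot 0.9}{2 \cdot \frac{1}{3} \cdot 0.9 + 2 \cdot \frac{1}{3} \cdot 0.1} = 0.45 \\[1ex] f_{post}(t_3) = f_{post}(t_4) &= \frac{\frac{1}{3} \cdot 0.1}{2 \cdot \frac{1}{3} \cdot 0.9 + 2 \cdot \frac{1}{3} \cdot 0.1} = 0.05 \end{aligned} \tag{2.16}$$
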
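The single-message update can be sketched in a few lines of Python. This is only an illustration under the assumptions of this section (all true messages of a given size equally likely, and likewise for false ones); the function name `bayes_update` and the dictionary representation of the distribution are hypothetical choices, not part of the thesis.

```python
def bayes_update(prior, message, q_k):
    """Single-message Bayesian update in the simplified form of Equation (2.14).

    prior   : dict mapping each departure time to its prior probability
    message : set of departure times named in the message (size k)
    q_k     : probability that a size-k message contains the true time
    """
    n, k = len(prior), len(message)
    if not 0 < k < n:
        raise ValueError("message size must satisfy 0 < k < n")
    # Unnormalized weights: q_k / k inside the message,
    # (1 - q_k) / (n - k) outside of it.
    weights = {
        t: p * (q_k / k if t in message else (1.0 - q_k) / (n - k))
        for t, p in prior.items()
    }
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()}


# The worked example: four departure times, uniform prior, M = {t1, t2}, q_2 = 0.9.
prior = {t: 0.25 for t in ("t1", "t2", "t3", "t4")}
posterior = bayes_update(prior, {"t1", "t2"}, q_k=0.9)
```

With these inputs the posterior assigns 0.45 to each time in the message and 0.05 to each of the other two, in agreement with the example.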
E. DEMPSTER-SHAFER BELIEF THEORY 
1. Background


As discussed in the previous section, using the Bayesian method requires us to make significant assumptions that may be difficult to justify. In order to avoid those assumptions and allow for a more robust update mechanism, we explore the non-Bayesian Dempster-Shafer Theory (DST) methods.


The Dempster-Shafer theory defines sets of possible outcomes (or realizations) to 
a random variable similar to standard probability theory. However, unlike the definition 
of a probability distribution that assigns probabilities to exclusive outcomes, Dempster- 
Shafer theory is more general and assigns “mass” values not only to events but also to 


sets of events. This allows combining pieces of information in a more flexible way. 


In the following paragraph, the basic Dempster-Shafer theory is defined mathematically, following Chapter 7.2.3 in (Hall & Llinas, 2001). Let $T$ be a set of mutually exclusive outcomes of an experiment (the “frame of discernment”) and let $\Omega = 2^T$ be the power set of $T$. The belief method assigns a “mass of evidence” $m$ to elements of $\Omega$.

The mass of evidence allocation, denoted by $m$, obeys the following rules:

$$m(\emptyset) = 0 \tag{2.17}$$

$$m(A) \geq 0, \quad \forall A \in \Omega \tag{2.18}$$

$$\sum_{A \in \Omega} m(A) = 1 \tag{2.19}$$

Similar to probability, the mass of the empty set is 0 (Equation (2.17)), it is non-negative (Equation (2.18)) and the total mass sums to 1 (Equation (2.19)). Unlike probability, the mass can be assigned to any element of $\Omega = 2^T$, that is, to any subset of $T$ (Equation (2.18)). Intuitively it can be viewed as similar to probability theory, but when a mass is assigned to a set, the probability can still “shift” between the elements in the set when new information is acquired.
Now we define useful terms of Dempster-Shafer theory. The belief (Bel) is:

$$Bel(A) = \sum_{B \subseteq A} m(B) \tag{2.20}$$

The belief of $A$ can be interpreted as the mass of evidence assigned to $A$ and all its subsets. This is the minimal probability that is already assigned to $A$.

The plausibility (Pl) is:

$$Pl(A) = 1 - \sum_{A \cap B = \emptyset} m(B) = \sum_{A \cap B \neq \emptyset} m(B) \tag{2.21}$$

The plausibility is the mass of evidence that can possibly be assigned to $A$ in the future (the mass not assigned to any set that does not intersect with $A$).

The interval $[Bel(A), Pl(A)]$ can serve as a confidence interval for $A$’s probability (Hall & Llinas, 2001).


As an example, let’s assume we have one observation from a source, stating: “With probability 90%, the vessel departed at $t_1$, and with 10% the vessel could have departed at any time.”

In this case, the masses are distributed as follows:

$$m(\{t_1\}) = 0.9, \qquad m(T) = 0.1 \tag{2.22}$$


And so the belief and plausibility can be calculated to be: 


(2.23) 











This is a very simple example, but we can already see that, as expected, the “confidence interval” for $t_1$ is between 0.9 and 1. The confidence interval for $t_2$, about which no information was given, is calculated to be $[0, 0.1]$.
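The belief and plausibility of Equations (2.20) and (2.21) can be computed directly from a mass assignment. The sketch below is illustrative only: it represents sets as Python `frozenset` objects, and it assumes a frame of three departure times, since the example does not state the size of $T$.

```python
def belief(masses, a):
    """Bel(A): total mass of the focal sets contained in A (Equation 2.20)."""
    return sum(m for b, m in masses.items() if b <= a)


def plausibility(masses, a):
    """Pl(A): total mass of the focal sets that intersect A (Equation 2.21)."""
    return sum(m for b, m in masses.items() if b & a)


# Assumed frame of three departure times (the size of T is an assumption).
frame = frozenset({"t1", "t2", "t3"})
# Mass assignment of Equation (2.22).
masses = {frozenset({"t1"}): 0.9, frame: 0.1}

t1, t2 = frozenset({"t1"}), frozenset({"t2"})
interval_t1 = (belief(masses, t1), plausibility(masses, t1))
interval_t2 = (belief(masses, t2), plausibility(masses, t2))
```

Here `interval_t1` evaluates to the interval from 0.9 to 1 and `interval_t2` to the interval from 0 to 0.1, matching the intervals quoted in the text.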




On the other hand, suppose the source stated: “With probability 90%, I’m sure the vessel departed at $t_1$, and with 10% probability the vessel could have departed at any other time.”

In this case, the masses are distributed as follows:

$$m(\{t_1\}) = 0.9, \qquad m(T - \{t_1\}) = 0.1 \tag{2.24}$$


And so the belief and plausibility can be calculated to be: 


(2.25) 





which yields different results, as now the confidence interval for $T - \{t_1\}$ is $[0.1, 0.1]$. The mass given to this set is 0.1. A message can be defined in a similar manner, but with regard to multiple departure times: for an informant who says that the departure time is one of $\{t_1, t_2\}$, and whom we know to be correct only 0.9 of the time, the corresponding mass assignment should be:

$$m(\{t_1, t_2\}) = 0.9, \qquad m(T - \{t_1, t_2\}) = 0.1 \tag{2.26}$$


And the belief and plausibility in this case are: 


(2.27) 








2. Combination Rules


Now that we have defined the Dempster-Shafer framework, the next step is to 
discuss how masses assigned by different sources should be combined in order to make 
sense out of multiple sources of information. The desired outcome of such a combination 


of mass assignments would be a new mass assignment. 


Probabilistically, if we have the probability assigned for each outcome by 
multiple sources and we assume independence, the combined probability distribution can 
be obtained by simply multiplying the probabilities assigned by different sources. In the 
Dempster-Shafer theory, the situation is more complicated and thus multiple combination 
rules are suggested with slightly different characteristics. The different combination rules 
are thoroughly discussed in Sandia Lab’s report (Sentz, 2002). In the following 


paragraphs, we describe a few of the prominent rules in more detail with examples. 


a. Dempster-Shafer 


The first suggested rule is the Dempster-Shafer rule. Given two mass assignments by different sources, $m_1$ and $m_2$, the combined mass assignment $m_{1,2}$ of a set $A$ is calculated by summing the products of the masses of all the sets $B$, $C$ whose intersection is $A$:

$$m_{1,2}(A) = \frac{1}{1 - K} \sum_{B \cap C = A} m_1(B)\, m_2(C), \quad A \neq \emptyset \tag{2.28}$$

where $K$ is a normalization factor that accounts for the conflict, that is, all the pairs of sets that have an empty intersection and whose corresponding mass therefore cannot be assigned to any set:

$$K = \sum_{B \cap C = \emptyset} m_1(B)\, m_2(C) \tag{2.29}$$

As an example, assume the first message specifies departure times $\{t_1, t_2\}$ with probability of 0.9 and the second message corresponds to times $\{t_2, t_3\}$, out of 5 possible departure times $S = \{t_1, t_2, t_3, t_4, t_5\}$:

$$\begin{aligned} m_1(\{t_1, t_2\}) &= 0.9, & m_1(\{t_3, t_4, t_5\}) &= 0.1 \\ m_2(\{t_2, t_3\}) &= 0.9, & m_2(\{t_1, t_4, t_5\}) &= 0.1 \end{aligned} \tag{2.30}$$

The combined masses can be calculated by determining the masses of the intersections:

 | $m_1(\{t_1, t_2\}) = 0.9$ | $m_1(\{t_3, t_4, t_5\}) = 0.1$
$m_2(\{t_2, t_3\}) = 0.9$ | $m_{1,2}(\{t_2\}) = 0.81$ | $m_{1,2}(\{t_3\}) = 0.09$
$m_2(\{t_1, t_4, t_5\}) = 0.1$ | $m_{1,2}(\{t_1\}) = 0.09$ | $m_{1,2}(\{t_4, t_5\}) = 0.01$
(2.31)

In this case, there is no conflict between the sources; every two sets with positive masses have a non-empty intersection ($K = 0$).


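However, if informants submit conflicting messages, the situation will be different. Let us see what happens if, in addition to the information before, the first informant is positive that the true departure time is not $t_3$, and therefore does not assign any mass to sets that include $t_3$, resulting in the following mass assignments:

$$\begin{aligned} m_1(\{t_1, t_2\}) &= 0.9, & m_1(\{t_4, t_5\}) &= 0.1 \\ m_2(\{t_2, t_3\}) &= 0.9, & m_2(\{t_1, t_4, t_5\}) &= 0.1 \end{aligned} \tag{2.32}$$

 | $m_1(\{t_1, t_2\}) = 0.9$ | $m_1(\{t_4, t_5\}) = 0.1$
$m_2(\{t_2, t_3\}) = 0.9$ | $m_{1,2}(\{t_2\}) = 0.81$ | $K = 0.09$
$m_2(\{t_1, t_4, t_5\}) = 0.1$ | $m_{1,2}(\{t_1\}) = 0.09$ | $m_{1,2}(\{t_4, t_5\}) = 0.01$
(2.33)

Now there are two sets in the different assignments that have an empty intersection (the upper-right cell in the table). This mass is added to $K$, which measures the amount of “conflict.” The way Dempster-Shafer handles this situation is by normalizing the conflict out, eventually assigning:

$$\begin{aligned} m_{1,2}(\{t_2\}) &= \frac{0.81}{1 - 0.09} \approx 0.89 \\ m_{1,2}(\{t_1\}) &= \frac{0.09}{1 - 0.09} \approx 0.1 \\ m_{1,2}(\{t_4, t_5\}) &= \frac{0.01}{1 - 0.09} \approx 0.01 \end{aligned} \tag{2.34}$$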
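The two-source rule of Equations (2.28) and (2.29) can be sketched as follows. This is a minimal illustration, not a reference implementation: mass assignments are represented as dictionaries from `frozenset` to mass, and the names `dempster_combine` and `fs` are hypothetical. The input reproduces a conflicting pair of assignments of the kind discussed above, with the set memberships taken as assumptions.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Dempster-Shafer combination of two mass assignments.

    Returns the normalized combined masses and the conflict K.
    """
    combined, conflict = {}, 0.0
    for (b, mass_b), (c, mass_c) in product(m1.items(), m2.items()):
        a = b & c
        if a:
            combined[a] = combined.get(a, 0.0) + mass_b * mass_c
        else:
            # Empty intersection: this product contributes to the conflict K.
            conflict += mass_b * mass_c
    if conflict >= 1.0:
        raise ValueError("total conflict: the rule is undefined")
    return {a: m / (1.0 - conflict) for a, m in combined.items()}, conflict


def fs(*items):
    """Shorthand for building a frozenset of departure times."""
    return frozenset(items)


m1 = {fs("t1", "t2"): 0.9, fs("t4", "t5"): 0.1}
m2 = {fs("t2", "t3"): 0.9, fs("t1", "t4", "t5"): 0.1}
combined, conflict = dempster_combine(m1, m2)
```

For this input the conflict is 0.09, and the combined masses are 0.81, 0.09 and 0.01 divided by 0.91.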





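Although this method may work well when the conflict is small, some paradoxes arise when the conflict between assignments is substantial. For example, if the first informant is almost certain that the departure time is $t_1$ but it might be $t_3$, and the second informant is quite sure about $t_2$, then:

$$\begin{aligned} m_1(\{t_1\}) &= 0.9, & m_1(\{t_3\}) &= 0.1 \\ m_2(\{t_2\}) &= 0.9, & m_2(\{t_3\}) &= 0.1 \end{aligned} \tag{2.35}$$

 | $m_1(\{t_1\}) = 0.9$ | $m_1(\{t_3\}) = 0.1$
$m_2(\{t_2\}) = 0.9$ | $K = 0.81$ | $K = 0.09$
$m_2(\{t_3\}) = 0.1$ | $K = 0.09$ | $m_{1,2}(\{t_3\}) = 0.01$
(2.36)

Paradoxically, the combined assignment will be $m_{1,2}(\{t_3\}) = \frac{0.01}{1 - 0.99} = 1$, disregarding the possibility of $t_1$ or $t_2$. Resolving this paradox is one of the main incentives in examining other combination rules.

This rule can be generalized to combine more than two messages:

$$m_{1,\ldots,n}(A) = \frac{1}{1 - K} \sum_{\bigcap_{1 \leq i \leq n} B_i = A} m_1(B_1)\, m_2(B_2) \cdots m_n(B_n) \tag{2.37}$$

with $K$ being:

$$K = \sum_{\bigcap_{1 \leq i \leq n} B_i = \emptyset} m_1(B_1)\, m_2(B_2) \cdots m_n(B_n) \tag{2.38}$$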


b. Yager’s Modified Dempster-Shafer Rule 


This method is similar to the basic Dempster-Shafer rule, with one 
difference: instead of normalizing the masses m by 1-K, K is added to the mass of the 
entire set T. The mass of the entire set T can be interpreted as the “ignorance,” since this 


mass does not help in distinguishing between different departure times. 


We now revisit the high-conflict example defined in (2.35) and (2.36). Following the same calculation, the final assignment would be $m_{1,2}(\{t_3\}) = 0.01$, $m_{1,2}(T) = 0.99$ with confidence intervals:


(2.39) 








suggesting that since the conflict is so large, every outcome is basically possible. This rule can also be extended to more than two pieces of evidence, but it is not associative. It is, however, commutative, as Equation (2.37) is symmetric.
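Yager’s variant differs from the basic rule only in where the conflicting mass goes. The sketch below is illustrative: the function name `yager_combine` is hypothetical, and the high-conflict input assumes three departure times, with each informant assigning 0.9 to a different time and 0.1 to a shared one.

```python
from itertools import product


def yager_combine(m1, m2, frame):
    """Yager's rule: the conflict K is added to the mass of the whole frame T
    instead of being normalized out."""
    combined, conflict = {}, 0.0
    for (b, mass_b), (c, mass_c) in product(m1.items(), m2.items()):
        a = b & c
        if a:
            combined[a] = combined.get(a, 0.0) + mass_b * mass_c
        else:
            conflict += mass_b * mass_c
    # The conflicting mass is interpreted as ignorance and given to T.
    combined[frame] = combined.get(frame, 0.0) + conflict
    return combined


frame = frozenset({"t1", "t2", "t3"})
m1 = {frozenset({"t1"}): 0.9, frozenset({"t3"}): 0.1}
m2 = {frozenset({"t2"}): 0.9, frozenset({"t3"}): 0.1}
result = yager_combine(m1, m2, frame)
```

The result keeps a mass of 0.01 on the shared time and moves the remaining 0.99 to the whole frame, so no single departure time is ruled out.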


c. Zhang’s Center Combination Rule 


This rule is yet another extension of the Dempster-Shafer combination rule. While the Dempster-Shafer rule does not account for the size of the intersection between two sets, Zhang’s rule does, by multiplying the assigned outcome mass by a metric $r(B, C)$, as logically the mass assigned to the intersection of $B$ and $C$ should increase with its size. A common metric is based on the cardinality of the intersection:

$$r(B, C) = \frac{|B \cap C|}{|B|\,|C|} \tag{2.40}$$

and the combined mass is:

$$m_{1,2}(A) = k \sum_{B \cap C = A} r(B, C)\, m_1(B)\, m_2(C) \tag{2.41}$$

with $k$ a normalization factor such that $\sum_{A \subseteq S} m_{1,2}(A) = 1$ (not the same as $K$, which accounted for the conflict in the Dempster-Shafer and Yager rules).


To see the difference between Zhang’s rule and the regular Dempster-Shafer combination rule, let’s look at the case where we have a positive observation regarding time $t_1$ and a message about times $t_1$ or $t_2$:

$$\begin{aligned} m_1(\{t_1\}) &= 0.9, & m_1(\{t_2, t_3, t_4, t_5\}) &= 0.1 \\ m_2(\{t_1, t_2\}) &= 0.9, & m_2(\{t_3, t_4, t_5\}) &= 0.1 \end{aligned} \tag{2.42}$$

As before, we calculate the masses of the intersections:

 | $m_1(\{t_1\}) = 0.9$ | $m_1(\{t_2, t_3, t_4, t_5\}) = 0.1$
$m_2(\{t_1, t_2\}) = 0.9$ | $m_{1,2}(\{t_1\}) = 0.81$ | $m_{1,2}(\{t_2\}) = 0.09$
$m_2(\{t_3, t_4, t_5\}) = 0.1$ | $K = 0.09$ | $m_{1,2}(\{t_3, t_4, t_5\}) = 0.01$
(2.43)

So the intervals for the departure times, after normalizing by $1 - K = 0.91$, are:

 | m | Bel | Pl
$\{t_1\}$ | 0.89 | 0.89 | 0.89
$\{t_2\}$ | 0.1 | 0.1 | 0.1
$\{t_3, t_4, t_5\}$ | 0.01 | 0.01 | 0.01
(2.44)





However, with Zhang’s rule we also calculate the value of $r(B, C) = \frac{|B \cap C|}{|B|\,|C|}$ for each intersection:

 | $m_1(\{t_1\}) = 0.9$ | $m_1(\{t_2, t_3, t_4, t_5\}) = 0.1$
$m_2(\{t_1, t_2\}) = 0.9$ | $m_{1,2}(\{t_1\}) = 0.81$, $r = \frac{1}{2}$ | $m_{1,2}(\{t_2\}) = 0.09$, $r = \frac{1}{8}$
$m_2(\{t_3, t_4, t_5\}) = 0.1$ | - | $m_{1,2}(\{t_3, t_4, t_5\}) = 0.01$, $r = \frac{1}{4}$
(2.45)

And by using Equation (2.41), the final combined masses according to Zhang’s rule are:

 | m | Bel | Pl
$\{t_1\}$ | 0.967 | 0.967 | 0.967
$\{t_2\}$ | 0.027 | 0.027 | 0.027
$\{t_3, t_4, t_5\}$ | 0.006 | 0.006 | 0.006
(2.46)





Since the intersection between the sets of $m_1(\{t_1\})$ and $m_2(\{t_1, t_2\})$ is much bigger, relative to the sizes of the sets, than the others (which can be interpreted as a better agreement), $m_{1,2}(\{t_1\})$ receives a higher mass in Zhang’s rule in comparison with Dempster-Shafer’s combination rule.

A conflict between assignments is resolved by the normalization coefficient $k$, similar to Dempster-Shafer’s combination rule.
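Zhang’s rule can be sketched in the same style. The code below is an illustration under the assumption that the metric is $r(B, C) = |B \cap C| / (|B|\,|C|)$; the names `zhang_combine` and `fs` are hypothetical, and the input is an observation about one time combined with a two-time message over five departure times.

```python
from itertools import product


def zhang_combine(m1, m2):
    """Zhang's center combination rule with r(B, C) = |B & C| / (|B| * |C|).

    Empty intersections are dropped and the remaining masses are rescaled
    to sum to 1 (the normalization factor k of Equation 2.41).
    """
    raw = {}
    for (b, mass_b), (c, mass_c) in product(m1.items(), m2.items()):
        a = b & c
        if a:
            r = len(a) / (len(b) * len(c))
            raw[a] = raw.get(a, 0.0) + r * mass_b * mass_c
    total = sum(raw.values())
    return {a: v / total for a, v in raw.items()}


def fs(*items):
    """Shorthand for building a frozenset of departure times."""
    return frozenset(items)


m1 = {fs("t1"): 0.9, fs("t2", "t3", "t4", "t5"): 0.1}
m2 = {fs("t1", "t2"): 0.9, fs("t3", "t4", "t5"): 0.1}
combined = zhang_combine(m1, m2)
```

The weighted masses before normalization are 0.81/2, 0.09/8 and 0.01/4, which normalize to roughly 0.967, 0.027 and 0.006.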


d. Mass Mean 


This rule of combination is the most straightforward one. According to the mass average rule, the mass of a set in the combined mass assignment is simply the average of the masses the set received:

$$m_{1,\ldots,n}(A) = \frac{1}{n} \sum_{1 \leq i \leq n} m_i(A) \tag{2.47}$$
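As a small illustration of the mass mean rule, the sketch below averages any number of mass assignments set by set. The function name `mean_combine` and the two example assignments are hypothetical.

```python
def mean_combine(assignments):
    """Mass mean rule: each set receives the average of the masses
    assigned to it by the n sources (a missing set counts as mass 0)."""
    n = len(assignments)
    combined = {}
    for masses in assignments:
        for a, m in masses.items():
            combined[a] = combined.get(a, 0.0) + m / n
    return combined


def fs(*items):
    """Shorthand for building a frozenset of departure times."""
    return frozenset(items)


m1 = {fs("t1", "t2"): 0.9, fs("t3", "t4", "t5"): 0.1}
m2 = {fs("t2", "t3"): 0.9, fs("t1", "t4", "t5"): 0.1}
averaged = mean_combine([m1, m2])
```

Each set named by only one of the two sources ends up with half of its original mass, and the total mass remains 1.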

Weighting the average according to some measure of confidence or reliability is also common, but this issue is not discussed in this thesis.

3. Observations and Messages

Defining observations and messages according to the Dempster-Shafer scheme is not straightforward. We define these terms as follows:

We translate a positive observation with “correctness” $q$ regarding time $t_i$ into a mass assignment of:

$$m(\{t_i\}) = q, \qquad m(T - \{t_i\}) = 1 - q \tag{2.48}$$

This means that according to this observation, with probability $q$ the true departure time is $t_i$ and with probability $1 - q$ it is any other time.

Similarly, a message $M$ with “correctness” $q_k$ ($|M| = k$) can be translated to the following mass assignment:

$$m(M) = q_k, \qquad m(T - M) = 1 - q_k \tag{2.49}$$

In this case the true departure time is included in the message $M$ with probability $q_k$, and with probability $1 - q_k$ it is not in the message.


4. Transforming Dempster-Shafer Belief and Plausibility Measures to 
Probability Values 


In order to support decision makers we must have some well-defined probability 
about the departure time of the target. This is straightforward with the Bayesian method, 
but the Dempster-Shafer theory only allows us to obtain Belief-Plausibility intervals. The 
following subsection describes a few methods of translating the belief function (or mass 


function) to a probability value. 


a. Pignistic Transformation 


The most popular transformation is the Pignistic Transformation (Smets, 1990) that distributes the mass assigned to a set uniformly among all the members of the set:

$$P_{pig}(t) = \sum_{A \in \Omega \ \text{s.t.}\ t \in A} \frac{m(A)}{|A|} \tag{2.50}$$

where $P_{pig}(t)$ is the estimated probability of departure time $t$ by this transformation. This transformation is intuitive: when we are given the mass of a subset and we are required to estimate the probability of each member of the set, a natural assumption is that the probability of all the members is equal.


Let’s examine the outcome of this transformation in the case of a single message:

$$m(\{t_1, t_2\}) = 0.9, \qquad m(\{t_3, t_4, t_5\}) = 0.1 \tag{2.51}$$


And the Belief and Plausibility in this case: 


(2.52) 





Then the pignistic probability of each time is:

$$P_{pig}(t_1) = P_{pig}(t_2) = \frac{0.9}{2} = 0.45$$

$$P_{pig}(t_3) = P_{pig}(t_4) = P_{pig}(t_5) = \frac{0.1}{3} \approx 0.033$$


Smets (2002) claims that the pignistic transformation is the one adequate for making decisions; however, it does not necessarily represent your belief:

At the credal level, beliefs are represented by a belief function; at the pignistic level, this belief function induces a probability function that is used to make decisions. This probability function should not be understood as representing your beliefs, it is nothing but the additive measure needed to make decision, i.e., to compute the expected utilities.

However, one of the principles of Dempster-Shafer theory is that with further information the mass can shift within the set. This transformation, however intuitive, disregards this flexibility by dividing the mass equally among the members of the set (Cobb & Shenoy, 2003).
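The pignistic transformation of Equation (2.50) amounts to a short loop over the focal sets. The sketch below is illustrative; the function name `pignistic` is hypothetical and the input is the single-message assignment of Equation (2.51).

```python
def pignistic(masses):
    """Pignistic transformation: split the mass of every focal set
    equally among its members and add up the shares per member."""
    probs = {}
    for a, m in masses.items():
        share = m / len(a)
        for t in a:
            probs[t] = probs.get(t, 0.0) + share
    return probs


masses = {
    frozenset({"t1", "t2"}): 0.9,
    frozenset({"t3", "t4", "t5"}): 0.1,
}
p_pig = pignistic(masses)
```

Departure times that belong to no focal set do not appear in the result, which corresponds to a pignistic probability of zero.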


b. Plausibility Transformation 


Another transformation is the Plausibility transformation, which is basically a normalization of the plausibility function of the singletons:

$$P_{pl}(t) = K^{-1}\, Pl(\{t\}) \tag{2.53}$$

with $K$ being the normalization factor:

$$K = \sum_{t \in T} Pl(\{t\}) \tag{2.54}$$

This transformation tries to keep the essence of Dempster-Shafer theory by considering the plausibility value of the departure times (that can be interpreted as the potential, or the biggest mass that can possibly be assigned to it if the right information is received) and normalizing those plausibilities. Looking at the same example as in the previous subsection, defined by Equation (2.51), we obtain:

$$K = \sum_{1 \leq i \leq n} Pl(\{t_i\}) = 0.9 + 0.9 + 0.1 + 0.1 + 0.1 = 2.1 \tag{2.55}$$

so that $P_{pl}(t_1) = P_{pl}(t_2) = 0.9 / 2.1 \approx 0.43$ and $P_{pl}(t_3) = P_{pl}(t_4) = P_{pl}(t_5) = 0.1 / 2.1 \approx 0.048$. The probability assigned to $t_1$ and $t_2$ is lower now, to account for the fact that there are three other departure times that are plausible.
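The plausibility transformation can be sketched as follows, again for the single-message assignment of Equation (2.51). The function name `plausibility_transform` is a hypothetical choice made for this illustration.

```python
def plausibility_transform(masses, frame):
    """Normalize the plausibilities of the singletons (Equations 2.53-2.54).

    Returns the probabilities and the normalization factor K.
    """
    # Pl({t}) is the total mass of the focal sets that contain t.
    pl = {t: sum(m for a, m in masses.items() if t in a) for t in frame}
    k = sum(pl.values())
    return {t: value / k for t, value in pl.items()}, k


frame = frozenset({"t1", "t2", "t3", "t4", "t5"})
masses = {
    frozenset({"t1", "t2"}): 0.9,
    frozenset({"t3", "t4", "t5"}): 0.1,
}
p_pl, k = plausibility_transform(masses, frame)
```

Here `k` is 2.1 up to floating-point rounding, and the probabilities are 0.9/2.1 for the two times in the message and 0.1/2.1 for the other three.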
c. Belief Transformation 


A less useful transformation is the Belief transformation that normalizes the beliefs of the singletons but disregards information about non-singletons. This transformation will not be considered in this work (and in the example discussed it does not make any sense).


F. DISCUSSION 


We have suggested two different probability update models: the Bayesian update and Dempster-Shafer Theory. Both models can be updated with observations and messages. The Bayesian method is well known, popular and mathematically rigorous. However, it requires multiple assumptions that are often difficult to justify.


The Dempster-Shafer Theory includes multiple combination rules. There is no clear method that is considered appropriate for all situations. According to our mass assignments as defined in (2.24) and (2.26) we do not expect to have any conflict since all the possible departure times are members of sets that have a positive mass, and therefore, Dempster-Shafer and Zhang’s rules are reliable and justifiable (Sentz, 2002). Yager’s rule is useful whenever there is large conflict; however, in our case, since there is no conflict, it is no different than Dempster-Shafer’s rule. The mean combination rule may not be appropriate when averaging extremes, but it is easy to compute and might provide satisfactory results in certain cases, and therefore is also of interest.


The transformation step from Dempster-Shafer theory measures to probabilities is also possible via several distinct methods. Although the pignistic method is more common in the literature, the Plausibility method has some appealing characteristics, and it is more consistent with Dempster-Shafer theory (Cobb & Shenoy, 2003).

In the following chapter we will compare the Dempster-Shafer, Zhang and mean methods, each of those transformed into probabilities by the two transformation models, and the Bayesian approach.

III. BAYES AND DEMPSTER-SHAFER THEORY COMPARISON


In this chapter, we compare the Bayesian update process and Dempster-Shafer 
methods. We first discuss the qualitative pros and cons of each method and show the 
equivalence between the two in certain situations. We conclude with a section describing 


the results from a simulation experiment. 


A. QUALITATIVE COMPARISON 
As discussed in the previous chapter, the Bayes’ method requires a prior probability distribution $f_{pre}(t)$. Dempster-Shafer theory does not require a prior for computing the updated distribution, although it can incorporate such a prior as an additional piece of information. The biggest advantage of Dempster-Shafer theory in our context is that it does not require one to specify the probabilities of receiving a message given the actual departure time: $P(M \mid T_d = t,\ t \in M)$ and $P(M \mid T_d = t,\ t \notin M)$. Data to estimate these conditional probabilities may not be available, and so assumptions that are difficult to justify have to be made to obtain them. However, once we impose these assumptions, calculating the updated probabilities is straightforward using Bayes’ theorem, and the result of the update is unique. Dempster-Shafer theory may utilize any one of multiple combination rules, and those rules may produce very different results depending on the agreement between the different pieces of information received.


The output of the Bayesian update gives us the estimated probability that a given departure time is in fact the true one. The Dempster-Shafer theory output is a distribution of masses that allows us to calculate Belief-Plausibility confidence intervals regarding the departure time. Those intervals give some insight about the probability of the vessel departing at a certain time, but in order to make decisions, and in particular when those intervals are large, an additional transformation is required to obtain probabilities. This transformation results in a degradation of the flexibility that makes Dempster-Shafer theory so appealing.

Computation-wise, the Dempster-Shafer theory machinery is much more intensive since each member in the power set of $T$ may be assigned a mass. A Bayesian update assigns probabilities only to the members of the set itself, not its power set. However, the Bayes’ method may incur additional computational costs if calculating $P(M \mid T_d = t,\ t \in M)$ and $P(M \mid T_d = t,\ t \notin M)$ from existing data is difficult. The following table summarizes the main differences between the Bayes’ updating process and the Dempster-Shafer theory:




















 | Bayes’ Method | Dempster-Shafer Theory
Prior | A prior distribution of the outcomes is required | Prior distribution not required
Event distribution | The probability of receiving each message in every state of the world is required | Only mass distribution is required
Combination Rules | Bayes’ Formula | Several different combination rules
Output | Probability distribution of outcomes | Results in Belief and Plausibility of outcomes and sets of outcomes. Requires transformation to obtain probabilities
Computation | Computationally easy, assigns probability values to the members of T. | Computationally intensive, requires assigning values to members of the power-set of T.














Table 1. Bayesian update - DST comparison.


B. BAYESIAN UPDATE - DEMPSTER SHAFER ZHANG EQUIVALENCE 


As we saw in Chapter II, the update process can be performed using multiple methods. In this section we show that the Bayesian update and the Dempster-Shafer Zhang method with a pignistic transformation produce the same probabilities under the model assumptions described in Chapter II. As stated, for the Bayesian model we assume that (a) $q_k$ is known, (b) the probabilities of receiving true messages of a certain size are equal and the probabilities of receiving false messages of the same size are also equal, and (c) the Bayes’ prior is a uniform distribution. The Zhang combination method is performed as described in Chapter II, Equation (2.41), assuming that (a) $q_k$ is known and (b) receiving a message $M_i$ of size $k$ yields the following mass assignment:

$$m(M_i) = q_k, \qquad m(T - M_i) = 1 - q_k \tag{3.1}$$


In order to show the Bayes - Zhang pignistic equivalence we calculate the probabilities assigned to a certain departure time $t$ by both methods after $N$ messages are received. We assume that the informant included the specific departure time $t$ in exactly $N_{in}$ messages out of the $N$ received, and $N_{out} = N - N_{in}$.


1. Bayesian Update 


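Recall from Equation (2.14) the update equation when the message size is $k$:

$$f_{post}(t) = \begin{cases} \dfrac{f_{pre}(t)\, \dfrac{q_k}{k}}{\sum_{s \in M} f_{pre}(s)\, \dfrac{q_k}{k} + \sum_{s \notin M} f_{pre}(s)\, \dfrac{1 - q_k}{n-k}} & t \in M \\[4ex] \dfrac{f_{pre}(t)\, \dfrac{1 - q_k}{n-k}}{\sum_{s \in M} f_{pre}(s)\, \dfrac{q_k}{k} + \sum_{s \notin M} f_{pre}(s)\, \dfrac{1 - q_k}{n-k}} & t \notin M \end{cases} \tag{3.2}$$

where $f_{pre}(s)$ is the probability distribution before the update. Since the denominator of the update function is merely a normalization coefficient, we can state that without normalizing, the updated distribution is:

$$f_{post}(t) \propto \begin{cases} f_{pre}(t)\, \dfrac{q_k}{k} & t \in M \\[2ex] f_{pre}(t)\, \dfrac{1 - q_k}{n-k} & t \notin M \end{cases} \tag{3.3}$$

Applying Bayes’ theorem and using the fact that the message probability depends only on the true departure time and $q_k$, and that the messages are independent given the departure time, allows us to formulate the update function after $N$ messages:

$$\begin{aligned} P_{post}(T_d = t \mid M^1, M^2, \ldots, M^N) &= \frac{P(M^1, M^2, \ldots, M^N \mid T_d = t)\, P_{pre}(T_d = t)}{P(M^1, M^2, \ldots, M^N)} \\[1ex] &= \frac{P(M^1 \mid T_d = t)\, P(M^2 \mid T_d = t) \cdots P(M^N \mid T_d = t)\, P_{pre}(T_d = t)}{P(M^1, M^2, \ldots, M^N)} \end{aligned} \tag{3.4}$$

Since $P(M_i \mid T_d = t) \propto \frac{q_k}{k}$ for messages that include $t$ and $P(M_i \mid T_d = t) \propto \frac{1 - q_k}{n-k}$ for messages that do not include $t$, the update function, under the uniform prior, after $N_{in}$ messages that include $t$ and $N_{out}$ that do not include $t$ is:

$$f_{post}(t) \propto \left(\frac{q_k}{k}\right)^{N_{in}} \left(\frac{1 - q_k}{n-k}\right)^{N_{out}} \tag{3.5}$$

This update function can later be normalized, although it is not required for proving the equivalence to the Pignistic Zhang method.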


2. Pignistic Zhang formulation 


Recall Zhang’s combination rule from Equations (2.40) and (2.41):

$$m_{1,2}(A) = k \sum_{B \cap C = A} \frac{|B \cap C|}{|B|\,|C|}\, m_1(B)\, m_2(C) \tag{3.6}$$

The specific departure time $t$ is included in $M_i$ in $N_{in}$ of the messages, and included in $T - M_i$ in $N_{out}$ of the messages. In order to eventually calculate the probability of departure at time $t$, let us first look at the intersection of the sets that include $t$ for all the messages, $C$:

$$C = \bigcap_{i:\, t \in M_i} M_i \ \cap \bigcap_{j:\, t \notin M_j} (T - M_j) \tag{3.7}$$

Since for every message $M_i$, $t$ is included either in $M_i$ or in $T - M_i$, the sum in Equation (3.6) is reduced to a single term and $C$ can be interpreted as the set that includes $t$ after the combination of all the messages received. The mass of $C$ is:

$$m_{combined}(C) \propto \frac{\prod_{i:\, t \in M_i} m(M_i)\ \prod_{j:\, t \notin M_j} m(T - M_j)\ |C|}{\prod_{i:\, t \in M_i} |M_i|\ \prod_{j:\, t \notin M_j} |T - M_j|} \tag{3.8}$$

We know that the mass assignment is $m(M_i) = q_k$, $m(T - M_i) = 1 - q_k$ and that the sizes of the sets are $|M_i| = k$ and $|T - M_i| = n - k$. By substituting those expressions in Equation (3.8) we have:

$$m_{combined}(C) \propto \frac{q_k^{N_{in}}\, (1 - q_k)^{N_{out}}\, |C|}{k^{N_{in}}\, (n-k)^{N_{out}}} \tag{3.9}$$

Once we have the combined mass assignment, we can calculate the pignistic probability of time $t$ according to Equation (2.50):

$$P_{pig}(t) = \sum_{A \in \Omega \ \text{s.t.}\ t \in A} \frac{m(A)}{|A|} = \frac{m_{combined}(C)}{|C|} \propto \frac{q_k^{N_{in}}\, (1 - q_k)^{N_{out}}\, |C|}{k^{N_{in}}\, (n-k)^{N_{out}}} \cdot \frac{1}{|C|} = \frac{q_k^{N_{in}}\, (1 - q_k)^{N_{out}}}{k^{N_{in}}\, (n-k)^{N_{out}}} \tag{3.10}$$

which equates to the expression achieved via the Bayesian update in Equation (3.5).
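The equivalence can also be checked numerically. The sketch below is an illustration, not the thesis code: it assumes a uniform prior, messages of equal size, and that Zhang’s rule is applied to $N$ messages by folding the two-source rule pairwise (the factors $r$ telescope, so only the final normalization differs). All function names and the example messages are hypothetical.

```python
from functools import reduce


def bayes_posterior(n, messages, q):
    """Normalized Bayesian posterior under a uniform prior (Equation 3.5)."""
    weights = []
    for t in range(n):
        w = 1.0
        for msg in messages:
            k = len(msg)
            w *= q / k if t in msg else (1.0 - q) / (n - k)
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]


def zhang_combine(m1, m2):
    """Two-source Zhang rule with r(B, C) = |B & C| / (|B| * |C|)."""
    raw = {}
    for b, mass_b in m1.items():
        for c, mass_c in m2.items():
            a = b & c
            if a:
                r = len(a) / (len(b) * len(c))
                raw[a] = raw.get(a, 0.0) + r * mass_b * mass_c
    total = sum(raw.values())
    return {a: v / total for a, v in raw.items()}


def zhang_pignistic(n, messages, q):
    """Combine all messages with Zhang's rule, then apply Equation (2.50)."""
    frame = frozenset(range(n))
    assignments = [
        {frozenset(msg): q, frame - frozenset(msg): 1.0 - q} for msg in messages
    ]
    combined = reduce(zhang_combine, assignments)
    probs = [0.0] * n
    for a, mass in combined.items():
        for t in a:
            probs[t] += mass / len(a)
    return probs


# Six departure times (0..5), three messages of size 2, reliability 0.7.
messages = [{0, 1}, {0, 2}, {1, 2}]
bayes = bayes_posterior(6, messages, q=0.7)
zhang = zhang_pignistic(6, messages, q=0.7)
max_gap = max(abs(b - z) for b, z in zip(bayes, zhang))
```

In this example `max_gap` is zero up to floating-point rounding, as the derivation predicts.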
C. SIMULATION 


We construct a simulation experiment, built on the simulation described in (Martin, 2009) and implemented in MATLAB, to further compare the Bayesian and Dempster-Shafer methods. Using the simulation we study the process of combining pieces of evidence. The simulation mimics the production of different messages and its main output is the distribution specifying the probability the target left at any particular departure time (probability distribution over $T$). We construct the simulation in two parts: 1) generating the stream of messages, which represents the state of the world and does not depend on the update mechanism, and 2) updating the departure time distribution using the different methods described thus far. For a fixed number of messages received, we run the simulation multiple times and calculate the fraction of runs in which the correct departure time has the highest probability after performing the updates with a certain method.


1. Generating Messages


First, the simulation generates a stream of messages of size $k$ out of $n$ possible departure times, assuming the informant’s reliability is $q_t$. That is, on average, a fraction $q_t$ of the messages include the true departure time. Next, we examine the update methods on different streams of messages. We examine streams of messages that are generated both according to the assumption that all true messages are equally likely (see Chapter II), and when this assumption is relaxed. We define an input parameter $NU \in [0, 1)$ to describe the measure of non-uniformity: $NU = 0$ implies that the messages are created uniformly, and as $NU$ increases to 1, the non-uniformity of messages also increases. The exact effect of this parameter is described in the following paragraph.

We assume, without loss of generality, that the true departure time $T_d$ is the last one possible: $T_d = n$.
The method for generating the messages of size $k$ proceeds as follows:

• First, we determine whether the message includes the true departure time. We do this by generating a random Bernoulli variable with parameter $q_t$.

• If the message is true, we include the true departure time in it. If it is false, we make sure that it does not include the true departure time. Next, we populate the rest of the message with departure times:

    • If $NU = 0$, the rest of the departure times are picked uniformly, that is, each of the possible times has the same probability of being included in the message.

    • If $NU > 0$, a random departure time $t$ is drawn from a geometric distribution with a parameter $NU$. If $t \in [1, \ldots, n]$, $t$ is not the true departure time and $t$ is not included in the message already, $t$ is added to the message. Else, $t$ is drawn again. This process repeats itself until the message is filled up with $k$ departure times. As an example, let us look at the probability of picking different departure times for some values of $NU$ when $n = 9$, as depicted in Figure 6:

Figure 6. Probability of populating the message with departure times for different NU values.


The bigger NU, the more likely smaller values will populate the message. In other 
words, messages with small values of ¢ will be more likely to be generated. Once the 
messages are created, the different update methods are used in order to estimate the true 
departure time. 
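The message-generation steps above can be sketched in Python (the thesis simulation itself is in MATLAB, and this is not that code). The sketch assumes that the geometric distribution has support 1, 2, ... with success probability $NU$, and all names (`draw_geometric`, `generate_message`, `q_true`, `nu`) are hypothetical.

```python
import math
import random


def draw_geometric(p, rng):
    """Draw from a geometric distribution on 1, 2, ... with success
    probability p, 0 < p < 1 (inverse-transform sampling)."""
    u = 1.0 - rng.random()  # uniform on (0, 1]
    return int(math.log(u) / math.log(1.0 - p)) + 1


def generate_message(n, k, q_true, nu, rng):
    """Generate one message of size k; the true departure time is n."""
    truth = n
    # Bernoulli draw: does the message include the true departure time?
    message = {truth} if rng.random() < q_true else set()
    candidates = [t for t in range(1, n + 1) if t != truth]
    while len(message) < k:
        if nu == 0:
            t = rng.choice(candidates)     # uniform filling
        else:
            t = draw_geometric(nu, rng)    # favors small departure times
        # Reject draws that are out of range, equal to the truth, or repeated.
        if 1 <= t <= n and t != truth and t not in message:
            message.add(t)
    return frozenset(message)


rng = random.Random(0)
stream = [generate_message(9, 3, 0.7, 0.4, rng) for _ in range(2000)]
hit_rate = sum(9 in msg for msg in stream) / len(stream)
```

Over a long stream, `hit_rate` should be close to the true reliability used to generate the messages (0.7 here), while the filler times concentrate on small values as $NU$ grows.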

2. Estimating the Probabilities of the Departure Times


We use six different methods to generate the combined probabilities:

1. Bayesian method, with a uniform prior and an updating process that assumes that all true messages of a certain size are equally likely, and the same is true for false messages. The update process uses this assumption, but the messages generated might not be drawn according to this assumption if $NU > 0$.

2. Dempster-Shafer rule, where the combined probability is derived from the pignistic method.

3. Dempster-Shafer rule, where the resulting probability is derived from the plausibility method.

4. Zhang’s rule, where the resulting probability is derived from the plausibility method.

5. Mean rule with a pignistic transformation.

6. Mean rule with a plausibility transformation.


Each of these methods is applied according to the description in Chapter II, assuming that there is an estimate of the informant’s reliability, $q_e$. Note that this parameter does not have to equal the true informant’s reliability, $q_t$, since the estimation of the informant’s reliability might not be correct. Different values of $q_e$ and $q_t$ will be tested in the simulation.

Since the messages are not necessarily created uniformly, the update processes might use the wrong distributional assumptions. However, we are interested in examining how well Bayes’ update method performs even when it uses the wrong distributions for its updating process in comparison with Dempster-Shafer methods. While the Dempster-Shafer methods have fewer explicit assumptions than the Bayesian update approach, the combination rules are somewhat arbitrary and may have hidden assumptions that are manifested in the different combination rules.


3. Constructing the Results

We focus on two measures of effectiveness (MOE) in our analysis: (1) the average probability assigned to the true departure time, and (2) the percentage of runs in which the true departure time has the highest probability. After each simulation run we record the probability that each updating method assigns to the true departure time. We also tabulate, for each method, whether the true departure time has the highest final probability. We do this because a decision maker who needs to take action would select the time with the highest probability. After many runs of the simulation we can calculate, as a function of the number of messages received, both the average probability assigned to the true departure time and the percentage of runs in which the true departure time has the highest probability. We choose the latter as our main MOE. We also calculate the standard deviation of this MOE across the runs conducted.
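The two measures can be computed from the final probability vectors of the runs. The following is a minimal sketch, not the thesis code: the function name, the list-of-lists layout and the tie-breaking choice are assumptions made for illustration.

```python
from statistics import mean, pstdev

def measures_of_effectiveness(final_distributions, true_index):
    """Summarize the runs of one updating method.

    final_distributions: one probability vector per run (hypothetical layout).
    true_index: position of the true departure time in each vector.
    """
    probs_true = [dist[true_index] for dist in final_distributions]
    # 1.0 when the true departure time has the highest probability in a run.
    # Ties resolve to the lowest index, an arbitrary choice made here.
    hits = [
        1.0 if max(range(len(dist)), key=dist.__getitem__) == true_index else 0.0
        for dist in final_distributions
    ]
    return {
        "avg_prob_true": mean(probs_true),   # MOE (1)
        "pct_correct": 100.0 * mean(hits),   # MOE (2), the main MOE
        "std_correct": pstdev(hits),         # spread of the hit indicator across runs
    }

# Three illustrative runs over three departure times; the true time is index 2.
runs = [[0.1, 0.2, 0.7], [0.5, 0.3, 0.2], [0.2, 0.2, 0.6]]
moe = measures_of_effectiveness(runs, true_index=2)
```

Calling the function once per method and per number of received messages gives the curves plotted against the number of messages.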


4. Input Parameters

The input parameters of the simulation are as follows:

1. Number of possible departure times (the cardinality of T): n.

2. Size of each message: k.

3. The true departure time, $t_0$ (without loss of generality it is fixed to be the latest possible departure time, n).

4. The true value of the probability that a message contains the true departure time, $q_t$. This parameter controls the message generation process and is not known to the operator. $q_t$ can be interpreted as the "true reliability" of the informant.

5. The estimated probability that a message is true, $q_e$. $q_e$ can be interpreted as the operator's estimate of the informant's reliability. The updating process requires an estimate of this probability, and we assume that the operator knows this estimate ahead of time (it is an input to the simulation). We assume this value is constant but does not have to be equal to $q_t$, as happens when the operator does not correctly estimate the reliability of the source. This allows us to generate a stream of messages that differs from the assumptions of the Bayes' and Dempster-Shafer theory updates.

6. Non-uniformity parameter NU, a value between 0 and 1 that controls how non-uniformly the messages are produced.

7. Number of messages constructed in the simulation.

8. Number of runs for each set of parameters.


5. Design of Experiments 
We have conducted multiple runs of different scenarios where the number of possible departure times is $n = 9$ ($= |T|$) and the size of the message is $k = 3$. The true departure time is fixed to be $T_0 = n = 9$.

We construct a full design with the parameter values stated in Table 2:

Parameter                        Values
True reliability $q_t$           0.4, 0.7, 0.95
Estimated reliability $q_e$      0.4, 0.7, 0.95
NU                               0, 0.2, 0.4, 1

Table 2. Parameter values.





Note that if the informant is clueless and thus chooses the departure times in the message totally at random (i.e., the informant provides no useful information), the true reliability $q_t$ would equal $\frac{k}{n}$. All the $q_t$ values used in the simulation (see Table 2) imply a "useful" informant that provides true messages with probability higher than that generated from a uniform distribution.

For each set of parameter values, a stream of 30 messages is created 100 times.
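The size of the full design and the reliability of a clueless informant can be checked with a short sketch. The variable names are hypothetical; the values are those of Table 2 and of the scenario description (n = 9, k = 3, 30 messages, 100 runs).

```python
from itertools import product
from math import comb

Q_TRUE = [0.4, 0.7, 0.95]   # true reliability values
Q_EST = [0.4, 0.7, 0.95]    # estimated reliability values
NU = [0, 0.2, 0.4, 1]       # non-uniformity values
N_TIMES, MSG_SIZE = 9, 3
N_MESSAGES, N_RUNS = 30, 100

# Full design: every combination of the three parameters.
design = [
    {"q_true": qt, "q_est": qe, "nu": nu}
    for qt, qe, nu in product(Q_TRUE, Q_EST, NU)
]

# A clueless informant picks a k-subset uniformly at random, so the chance
# that it contains the true time is C(n-1, k-1) / C(n, k) = k / n.
clueless_reliability = comb(N_TIMES - 1, MSG_SIZE - 1) / comb(N_TIMES, MSG_SIZE)

# Total number of simulated messages over the whole design.
total_messages = len(design) * N_RUNS * N_MESSAGES
```

With these values the design has 36 parameter sets, and every reliability value in Table 2 exceeds the clueless level of 1/3.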


6. Simulation Results 


For most of the input parameter values, the different methods produce similar results. Figures 7 and 8 show the results obtained for one such scenario, where the true and estimated reliability are both 0.7 and there is no non-uniformity: $q_t = q_e = 0.7$, $NU = 0$. Figure 7 depicts the percentage of runs in which the true departure time has the highest probability, while Figure 8 shows the average probability assigned to the true departure time.

Figure 7. Probability of picking the correct departure time. 100 runs with parameters $q_t = q_e = 0.7$, $NU = 0$.


As one would expect, the probability of choosing the correct departure time increases with the number of messages received, for all methods. This occurs because the operator gains more useful information regarding the departure time. As the number of received messages increases, the probability that an incorrect departure time is included in more messages than the correct departure time decreases.

However, the probability assigned by the different methods to the correct departure time can differ significantly, as Figure 8 shows.

Figure 8. Probability assigned to the correct departure time. 100 runs with parameters $q_t = q_e = 0.7$, $NU = 0$.


Based on the results in Figure 8 we can divide the six methods into three groups. The probability assigned to the correct departure time by the Bayesian and Zhang methods increases at the fastest rate as a function of the number of messages. The probability that the two Dempster-Shafer combination rules (plausibility and pignistic) assign to the correct departure time increases at a slower rate with the number of messages, compared to the Bayesian update and the Zhang combination rule. This is because the two Dempster-Shafer combination rules do not take into account the sizes of the sets combined when assigning the mass of the intersection of those sets, as described in Chapter II. Interestingly, the mean combination methods (plausibility and pignistic) do not change the prior probability assigned to the correct departure time. Let us look deeper into this phenomenon by examining the pignistic-mean method.


A single message is true with probability $q_t$ (the true reliability) and false with probability $1 - q_t$. The expected probability assigned to the correct departure time, according to the pignistic transformation defined in (2.50) and using the estimated reliability $q_e$, is $q_t \frac{q_e}{k} + (1 - q_t)\frac{1 - q_e}{n - k} \approx 0.18$ for the parameter values $q_t = q_e = 0.7$. Now let us look at the situation after two messages, $M_1$ and $M_2$. The mass assignment that corresponds to those messages is $m_1(M_1) = q_e$, $m_1(T - M_1) = 1 - q_e$ and $m_2(M_2) = q_e$, $m_2(T - M_2) = 1 - q_e$. According to the mean rule (2.47), the combined assignment would be:





Now let us calculate the probability assigned to the correct departure time according to the pignistic transformation. Let us define $0 \le N_t \le 2$ as the number of messages that include $t$, as in section A. For every possible value of $N_t$ we calculate the probability that $N_t = n_t$ and the probability assigned to time $t$ given that it is included in $n_t$ messages. The explicit calculations are shown in the following table:

$N_t$     $P(N_t = n_t)$     $m_{12}(t \mid N_t = n_t)$

Table 3. Probability of having $n_t$ messages that include $t$ and the mass assigned to $t$.


From those three possibilities we can calculate that the expected probability assigned to the correct time is $E[m_{12}(t)] = \sum_{n_t = 0,1,2} P(N_t = n_t) \cdot m_{12}(t \mid N_t = n_t) \approx 0.18$. Doing the same calculation with an increasing number of messages yields similar results.
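The calculation can be reproduced numerically. The sketch below rests on two assumptions that are not confirmed by this section: that the mean rule (2.47) averages the mass functions of the individual messages, and that the pignistic transformation (2.50) spreads the mass of each focal set evenly over its elements. The function name is hypothetical.

```python
from math import comb

def expected_mean_pignistic(n, k, q_true, q_est, n_messages):
    """Expected pignistic probability of the true time under the mean rule."""
    in_share = q_est / k                # share from a message that includes t
    out_share = (1 - q_est) / (n - k)   # share from a message that excludes t
    expected = 0.0
    for hits in range(n_messages + 1):
        # Probability that exactly `hits` of the messages include the true time.
        p_hits = (comb(n_messages, hits)
                  * q_true ** hits
                  * (1 - q_true) ** (n_messages - hits))
        # Averaging the mass functions averages the per-message shares.
        bet_p = (hits * in_share + (n_messages - hits) * out_share) / n_messages
        expected += p_hits * bet_p
    return expected

one_message = expected_mean_pignistic(9, 3, 0.7, 0.7, 1)
two_messages = expected_mean_pignistic(9, 3, 0.7, 0.7, 2)
ten_messages = expected_mean_pignistic(9, 3, 0.7, 0.7, 10)
```

Under these assumptions the expectation is the same for any number of messages, which is consistent with the observation that the mean methods do not move the probability assigned to the correct departure time.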


Although the probability of the correct departure time derived by the two mean methods (pignistic and plausibility) is much lower than in the other methods, the probabilities assigned to the incorrect times are $\frac{1 - 0.18}{8} \approx 0.10$ (when they are uniform). Thus, as Figure 7 illustrates, using the mean Dempster-Shafer methods yields the correct departure time in most cases. We conclude that although the mean methods assign incorrect probabilities to the departure times, they still identify the most probable departure time as accurately as the other methods.


For most of the input combinations in Table 2, the methods produced nearly identical results. However, there are two sets of input values that yielded non-trivial differences between the methods: (1) when the true reliability is low, $q_t = q_e = 0.4$, $NU \ge 0$. For these input values the Bayesian, mean-pignistic, and Zhang-plausibility methods had the highest probability of picking the correct departure time; and (2) when the estimated reliability is lower than the true one, $q_t = 0.7, 0.95$, $q_e = 0.4$, $NU = 0$. For these input values, the combination methods differ, with the Bayes' and mean-pignistic methods performing the best, followed by Zhang-plausibility.


The input values that yield the greatest difficulties for the update process are those with low informant reliability and some non-uniformity of the generated messages, or those cases where the estimated reliability is less than the true reliability. Unexpectedly, the Bayesian update proves to be robust and performs near the top over all scenarios, even when Bayes' assumptions do not hold. Mean-pignistic performed nearly as well as Bayes, with Zhang-plausibility performing slightly worse. The Dempster-Shafer combination rule and the mean-plausibility rule do not perform as well.


As an example of one of the cases where the methods differ, let us look closely at the results for input values of $q_t = q_e = 0.4$, $NU = 0.4$, in Figure 9:

Figure 9. Probability of picking the correct departure time. 100 runs with parameters $q_t = q_e = 0.4$, $NU = 0.4$.


Bayes, Zhang and mean-pignistic methods clearly outperform the other methods. The probability of choosing the right departure time increases slowly for these methods, but actually decreases for the other methods. The intuition behind this phenomenon is the disregard of the set sizes (as discussed in this section) and the bias caused by messages generated non-uniformly towards wrong departure times, which misleads the Dempster-Shafer combination rules. Figure 10 emphasizes this point by showing how the probability of the true departure time changes as the number of messages increases:

Figure 10. Probability assigned to the correct departure time. 100 runs with parameters $q_t = q_e = 0.4$, $NU = 0.4$.

It is not surprising that the Dempster-Shafer combination rule performs poorly, since it does not take into account the size of the combined sets. Interestingly, the mean-pignistic method performs better than the mean-plausibility method, implying that the pignistic transformation is more appropriate in our context. This is consistent with the conclusion in (Smets, 2002) that the pignistic transformation is appropriate whenever beliefs must be translated into decisions.


It is not trivial that the probability of choosing the correct departure time decreases for some of the methods. This can be explained by the non-uniformity parameter, which biases messages toward incorrect departure times, and by the difficulty of coping with low reliability values.


D. DISCUSSION 


In this chapter we compared the characteristics and performance of the updating methods described in Chapter II. The Bayes' method is mathematically rigorous but requires a number of assumptions not needed for the Dempster-Shafer methods. However, there are several ways to implement the Dempster-Shafer update, and it is not clear in advance which implementation would be most appropriate for a given scenario.

We have developed a simulation and used it to compare the different updating methods under different conditions. Our analysis reveals that even when the assumptions of the Bayes' update process are violated, that is, when the messages provided by the informant are not constructed uniformly, it still manages to yield the best results. The Dempster-Shafer methods did not perform better than the Bayes' update method even though they do not explicitly assume uniformity. Amongst the Dempster-Shafer methods, Zhang's and the mean combination rules perform better. Amongst the transformations from belief to probabilities, the pignistic transformation was found to be more appropriate in our scenario because the probability assigned is used for decision-making, and not just for representing the measure of belief.


All the methods performed poorly when the reliability of the informant is low, or mistaken to be low, and there is non-uniformity in the way he produces messages.

IV. EXTENSIONS


In the previous chapters we assumed a single vessel that can only travel on one route. In this chapter we extend the model to include multiple routes and multiple vessels.


A. SINGLE VESSEL AND MULTIPLE NON-INTERSECTING ROUTES 


Let us consider the case where the vessel may use one of multiple routes that do not intersect. Recall that in Chapter II we assumed the speed of the target is fixed and known, and therefore its location at any time can be derived from the departure time. Now, the location is determined by the departure time and the route.

Let $R = \{r_1, r_2, \ldots\}$ be a set of non-intersecting routes. Thus, the tuple $w \in W \subseteq T \times R$ is sufficient to describe the target location at any given time. We also redefine $n$ as the number of possible combinations of departure time and route, $n = |W|$.


We assume that an observation gives us information regarding a combination of departure time and route ("The radar has detected a vessel leaving at 8 on the route that is close to the coast"). A message from the informant relates to a subset of the possible departure time and route combinations, for example: "The vessel leaves at 8 or 10 am on route 1 or 2," or "The vessel will be on route 3, departure time is unknown."


Let $w'$ be a two-dimensional parameter denoting the departure time and route. Now we can perform either the Bayesian or Dempster-Shafer theory updates as in Chapter II. For instance, the Bayesian update function after a positive observation $w'$ regarding departure time and route would be (as in Equation (2.5)):

$$
f_{update+}(w) =
\begin{cases}
\dfrac{(1 - P_{fn})\, f(w')}{(1 - P_{fn})\, f(w') + P_{fp}\,(1 - f(w'))} & w = w' \\[2ex]
\dfrac{P_{fp}\, f(w)}{(1 - P_{fn})\, f(w') + P_{fp}\,(1 - f(w'))} & w \ne w'
\end{cases}
\tag{4.1}
$$

where $P_{fp}$ and $P_{fn}$ denote the sensor's false positive and false negative probabilities.

This allows us to calculate the probability that the target has departed at time $t$ on route $r$, for every combination of departure time and route $w = (t, r)$.


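In order to update on messages, we must make assumptions similar to the ones made in Chapter II. We assume here that all the true messages of the same size are equally likely, where "size" refers to the number of tuples of departure time and route combinations in the message. We also assume that the probability that a message is true is determined by the size $k$.

Following Equation (2.14), the update after a message $M$ that is true with probability $q_k$ is:

$$
f_{update}(w) =
\begin{cases}
\dfrac{\frac{q_k}{k}\, f(w)}{\frac{q_k}{k} \sum_{w_i \in M} f(w_i) + \frac{1 - q_k}{n - k} \sum_{w_i \notin M} f(w_i)} & w \in M \\[3ex]
\dfrac{\frac{1 - q_k}{n - k}\, f(w)}{\frac{q_k}{k} \sum_{w_i \in M} f(w_i) + \frac{1 - q_k}{n - k} \sum_{w_i \notin M} f(w_i)} & w \notin M
\end{cases}
\tag{4.2}
$$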
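The message update over tuples of departure time and route can be sketched as follows. This is an illustrative implementation under the stated assumptions (equally likely true messages of a given size, weights $q_k/k$ inside the message and $(1 - q_k)/(n - k)$ outside it); the function name and the dictionary layout are hypothetical, and the message is assumed to be a proper non-empty subset of the space.

```python
def update_on_message(prior, message, q_k):
    """Bayesian update of a distribution over (time, route) tuples.

    prior: dict mapping each tuple w to its probability f(w).
    message: set of tuples named by the informant (0 < k < n assumed).
    q_k: probability that a message of this size is true.
    """
    n, k = len(prior), len(message)
    w_in, w_out = q_k / k, (1 - q_k) / (n - k)
    unnormalized = {
        w: p * (w_in if w in message else w_out) for w, p in prior.items()
    }
    total = sum(unnormalized.values())
    return {w: v / total for w, v in unnormalized.items()}

# Three departure times and three routes give n = 9 tuples with a uniform prior.
space = [(t, r) for t in (6, 7, 8) for r in (1, 2, 3)]
prior = {w: 1 / len(space) for w in space}
# "The vessel will be on route 3, departure time is unknown."
message = {(6, 3), (7, 3), (8, 3)}
posterior = update_on_message(prior, message, q_k=0.7)
```

With a uniform prior each tuple in the message receives $q_k/k$ and each remaining tuple receives $(1 - q_k)/(n - k)$.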


Applying Dempster-Shafer theory methods to this case is also straightforward. If $p$ is the probability of a true observation regarding departure time and route $w'$, the corresponding mass assignment is:

$$ m(\{w'\}) = p, \qquad m(W - \{w'\}) = 1 - p \tag{4.3} $$

and the mass assignment for a message $M$ that includes $k$ departure times and routes is:

$$ m(M) = q_k, \qquad m(W - M) = 1 - q_k \tag{4.4} $$

Now that we have assigned masses to the subsets of $W$, the combination rules and probability transformations discussed in Chapter II can be applied in exactly the same manner.
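As an illustration, the mass assignment for a message and its translation into probabilities can be sketched as below. The pignistic transformation is written in its standard form (each focal set's mass is split evenly among its elements), which is assumed here to match Equation (2.50); the function names are hypothetical and the message is assumed to be a proper non-empty subset of $W$.

```python
def message_mass(space, message, q_k):
    """Mass assignment for a message: q_k on M and 1 - q_k on W - M."""
    m = frozenset(message)
    return {m: q_k, frozenset(space) - m: 1 - q_k}

def pignistic(space, masses):
    """Split the mass of every focal set evenly among its elements."""
    bet = {w: 0.0 for w in space}
    for focal, mass in masses.items():
        for w in focal:
            bet[w] += mass / len(focal)
    return bet

space = [(t, r) for t in (6, 7, 8) for r in (1, 2, 3)]
masses = message_mass(space, {(6, 3), (7, 3), (8, 3)}, q_k=0.7)
bet_p = pignistic(space, masses)
```

For a single message this gives the same numbers as the Bayesian update with a uniform prior; the methods differ only once several pieces of evidence are combined.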


Messages already apply to multiple values in the basic model discussed in Chapter II, and therefore this extension is not relevant for updating on messages; the update mechanism for the messages is exactly as in Chapter II. In the following sections we discuss only the application of the extensions to the update process for observations.


B. MULTIPLE ROUTES WITH INTERSECTIONS 


Intersecting routes do not affect the updating mechanism regarding the messages because the time and the route aspects of the problem decouple. However, if routes can intersect, an observation may apply to more than one route. Let us assume that we receive an observation $O$ regarding an intersection point that may apply to $k$ tuples of routes and departure times. An observation may apply to more than one departure time if, for instance, one of the routes' departure points is further from the intersection point.


Figure 11 depicts an example of such a scenario. In this example, an observation is made at 9:00 at the intersection of multiple possible routes. This observation can be applied to the tuples (departure time is 6:00, "Northwest-Southeast route"), (departure time is 7:00, "middle route") and (departure time is 6:00, "Northeast-Southwest route").

Figure 11. Intersecting routes.

Let us define an observation as the subset of the possible tuples that it relates to, $O \subseteq W$. We define the false negative and false positive probabilities of the sensor as in Chapter II; we denote a positive observation, meaning "there is something that corresponds to locations and departure times $O$," as $O_+$, and a negative observation, "the sensor did not recognize anything as belonging to $O$," as $O_-$. The false positive error is defined as $P(O_+ \mid w' \notin O) = P_{fp}$ and the false negative error as $P(O_- \mid w' \in O) = P_{fn}$. It is reasonable to assume that the errors do not depend on the departure time and route, and therefore the updated probability following a positive observation $O_+$ is:

$$
f_{update+}(w) =
\begin{cases}
\dfrac{(1 - P_{fn})\, f(w)}{(1 - P_{fn}) \sum_{w_i \in O} f(w_i) + P_{fp} \left(1 - \sum_{w_i \in O} f(w_i)\right)} & w \in O \\[3ex]
\dfrac{P_{fp}\, f(w)}{(1 - P_{fn}) \sum_{w_i \in O} f(w_i) + P_{fp} \left(1 - \sum_{w_i \in O} f(w_i)\right)} & w \notin O
\end{cases}
\tag{4.5}
$$

The derivation of (4.5) follows the same derivation as Equation (2.5) and is similar to the case of receiving a message that contains multiple departure times as in Equation (2.14), but with one difference: while for messages we had to assume some uniformity among the messages produced, here the fact that the errors are independent is sufficient.
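The update after a positive observation that covers several tuples can be sketched in a few lines. This is a plain application of Bayes' rule with the two sensor error probabilities, not code from the thesis; the function name, the route labels and the numerical error rates are illustrative assumptions.

```python
def update_on_positive_observation(prior, observed, p_fp, p_fn):
    """Bayesian update after a positive observation of the subset `observed`.

    Tuples inside the observed subset are weighted by 1 - p_fn (a true
    detection); tuples outside it are weighted by p_fp (a false alarm).
    """
    unnormalized = {
        w: p * ((1 - p_fn) if w in observed else p_fp) for w, p in prior.items()
    }
    total = sum(unnormalized.values())
    return {w: v / total for w, v in unnormalized.items()}

# Two departure times and three routes, uniform prior over the six tuples.
routes = ("NW-SE", "middle", "NE-SW")
space = [(t, r) for t in (6, 7) for r in routes]
prior = {w: 1 / len(space) for w in space}
# An observation at the intersection that fits three tuples.
observed = {(6, "NW-SE"), (7, "middle"), (6, "NE-SW")}
posterior = update_on_positive_observation(prior, observed, p_fp=0.1, p_fn=0.2)
```

When the observed subset contains a single tuple, the same function reduces to the single-tuple update of Equation (4.1).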

If the vessel can switch routes at the intersection points, as shown in Figure 12, the situation is slightly more complicated:

Figure 12. Intersecting routes with possibility of switching routes.

In this case we can formulate the problem as consisting of four routes, two of which overlap at the departure and arrival points, and all four intersect in the middle point. Similarly, wherever $r$ routes meet in an intersection point, we can define $r^2$ distinct routes. After this small alteration we can apply the update process defined in (4.5) to solve this case as well.


C. MULTIPLE TARGETS 


Dealing with multiple targets is a much more complicated topic. However, when the total number of targets in the area of interest is known, a scheme similar to the one presented in sections A and B of this chapter can be applied. We assume that there is only one possible route, as in Chapter II.

Let us assume that there are $u$ targets, $Z = \{z_1, \ldots, z_u\}$. The tuple $w = (t, z)$ specifies that target $z$ departed at time $t$. We redefine our space of interest as $W \subseteq T \times Z$, $|W| = n$, which describes every possible vessel's departure time and identity.

An observation can relate to a specific subset of departure times and vessels. If the sensor can recognize vessels, the observation will include only a single one; if it can only classify them into different categories, then the observation may relate to a subset of the vessels.


Assuming that the sensor's false positive and false negative probabilities ($P_{fp}$ and $P_{fn}$) do not depend on the type of the target, the update function following a positive observation $O_+$ is:

$$
f_{update+}(w) =
\begin{cases}
\dfrac{(1 - P_{fn})\, f(w)}{(1 - P_{fn}) \sum_{w_i \in O} f(w_i) + P_{fp} \left(1 - \sum_{w_i \in O} f(w_i)\right)} & w \in O \\[3ex]
\dfrac{P_{fp}\, f(w)}{(1 - P_{fn}) \sum_{w_i \in O} f(w_i) + P_{fp} \left(1 - \sum_{w_i \in O} f(w_i)\right)} & w \notin O
\end{cases}
\tag{4.6}
$$

exactly as in (4.5).

If the sensor characteristics depend on the type of the vessel (which is reasonable, since bigger vessels are easier to detect, and debris is more likely to be misclassified as a small vessel), the situation is more complicated. Now the false positive error is a function of the target, $P(O_+ \mid w \notin O, z) = P_{fp}(z)$, and likewise the false negative error is $P(O_- \mid w \in O, z) = P_{fn}(z)$.


The update process can be formulated, for $w = (t, z)$ and with $z_i$ denoting the target of the tuple $w_i$, as:

$$
f_{update+}(w) =
\begin{cases}
\dfrac{(1 - P_{fn}(z))\, f(w)}{\sum_{w_i \in O} (1 - P_{fn}(z_i))\, f(w_i) + \sum_{w_i \notin O} P_{fp}(z_i)\, f(w_i)} & w \in O \\[3ex]
\dfrac{P_{fp}(z)\, f(w)}{\sum_{w_i \in O} (1 - P_{fn}(z_i))\, f(w_i) + \sum_{w_i \notin O} P_{fp}(z_i)\, f(w_i)} & w \notin O
\end{cases}
\tag{4.7}
$$

This scheme works nicely for a known number of targets; however, this Bayesian update process is not straightforward to implement in cases where the number of vessels is not known or changes during the scenario.

D. DIFFERENT VELOCITIES


Let us again consider the case of a single target and a single route but multiple fixed velocities. Let us define a set of possible velocities $V = \{v_1, \ldots, v_h\}$ and discuss the probability distribution of $w = (t, v)$, with our space of interest being $W \subseteq T \times V$ and $|W| = n$.

An observation that was taken at time $t$ regarding a certain location, say at a distance $d$ along the route, needs to be translated to the corresponding subset of $W$: all the departure time and velocity combinations that would bring the vessel to the location of the observation at the time it was taken. The subset that corresponds to this observation can be constructed by:

(4.8)

Now that we have established the subset of $W$ that the observation refers to, we can continue with the update process as in the previous sections.
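One way to construct such a subset is to keep every pair of departure time and velocity whose travel time over the distance $d$ ends exactly at the observation time. The sketch below is an assumption about how the construction could be coded, not a transcription of Equation (4.8); the function name, the tolerance parameter and the units (hours and nautical miles) are hypothetical.

```python
def observation_subset(space, obs_time, distance, tol=1e-9):
    """Return the (departure_time, velocity) pairs consistent with an observation.

    A pair is consistent when departure_time + distance / velocity equals the
    observation time, up to the numerical tolerance `tol`.
    """
    return {
        (t0, v)
        for (t0, v) in space
        if v > 0 and abs(t0 + distance / v - obs_time) <= tol
    }

times = [6, 7, 8]                  # departure times, in hours
velocities = [10.0, 15.0, 30.0]    # speeds, in knots
space = {(t, v) for t in times for v in velocities}
# A detection at 9:00, 30 nautical miles along the route.
subset = observation_subset(space, obs_time=9, distance=30.0)
```

In practice the tolerance would reflect the sensor's location accuracy and the discretization of the departure times.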


E. SUMMARY 


In this chapter we have proposed several extensions to the basic model developed in Chapter II to accommodate multiple routes (with and without intersections), multiple targets and multiple velocities. The probability updating process for both observations and messages is very similar to the processes discussed in Chapters II and III and requires only small changes to accommodate those extensions. Therefore we expect that the results found in Chapter III extend to these cases as well. Accommodating multiple extensions simultaneously requires more bookkeeping and notation, but is straightforward using the techniques described in this chapter.

Known correlations between the parameters of the model, such as the velocity and the type of vessel, can be incorporated into the prior distribution. The prior can also account for intelligence regarding the likelihood of the possible routes.

V. ASSESSING THE INFORMANT'S RELIABILITY


In previous chapters we assumed that the reliability $q_k$ of an informant is known. In reality it will not be known. As the operator receives more information from an informant, he should update his estimate of the informant's reliability. In this chapter we develop a method of estimating and updating the reliability parameter $q_k$, together with the distribution of the departure time $T_0$. A key factor in updating the reliability is whether the operator is able to verify the truthfulness of the informant's previous messages. In section A we examine the situation where the operator is able to verify the truth of the message, and in section B we examine the more challenging situation where the operator does not have this capability.


A. MESSAGE TRUTH CAN BE VERIFIED 
We assume that the operator does not know the exact reliability parameter $q_k$, but has some prior knowledge about its probability distribution. Recall that $q_k$ is the probability that a message of size $k$ contains the true departure time. Equivalently, we can interpret it as the long-run fraction of messages that are true. Over time, the operator observes whether messages of a certain size are true and uses this information to update the probability distribution of $q_k$. This situation is similar to the problem of flipping a biased coin with an unknown probability of obtaining a head and updating that probability over time based on the observed outcomes of the flips. We treat the reliability parameter as a random variable $Q_k$ that takes the values $0 \le q_k \le 1$. Let us define $X$ as a random variable denoting whether the message received is verified to be true ($X = 1$) or untrue ($X = 0$). The probability of receiving a true or false message is (by definition):

$$ P(X = x \mid q_k) = q_k^{x}\,(1 - q_k)^{1 - x} \tag{5.1} $$

We can use the expression in Equation (5.1) to formulate an update for the distribution of $Q_k$ based on the veracity of the most recent message:

$$ P(Q_k = q_k \mid X = x) = \frac{P(X = x \mid Q_k = q_k)\, P(Q_k = q_k)}{P(X = x)} \tag{5.2} $$

where $P(X = x)$ is the marginal probability that the operator verifies the message as true ($x = 1$) or untrue ($x = 0$).


We would like the posterior distribution of $Q_k$ to be from the same family as its prior, so that the calculations are tractable. In Bayesian terminology, a prior with this property is called a "conjugate prior." The Beta distribution satisfies this condition for the case of Bernoulli trials, and therefore it is common to use the Beta distribution as the prior (see Berger, 1993). The Beta distribution has two parameters $\alpha, \beta$, with mean $E[Q_k] = \frac{\alpha}{\alpha + \beta}$, and probability density function:

$$ f_{Q_k}(q_k) = \frac{q_k^{\alpha - 1}\,(1 - q_k)^{\beta - 1}}{B(\alpha, \beta)} \tag{5.3} $$

where the denominator of (5.3) is the Beta function, which is defined as $B(x, y) = \int_0^1 t^{x - 1}(1 - t)^{y - 1}\,dt$. By substituting the prior (5.3) into Bayes' theorem (5.2) we obtain the posterior distribution of $Q_k$:

$$ f_{Q_k}(q_k \mid x) = \frac{q_k^{x}\,(1 - q_k)^{1 - x}}{P(x)} \cdot \frac{q_k^{\alpha - 1}\,(1 - q_k)^{\beta - 1}}{B(\alpha, \beta)} = \frac{q_k^{\alpha + x - 1}\,(1 - q_k)^{\beta - x}}{P(x)\, B(\alpha, \beta)} \tag{5.4} $$

And since $P(x)$ is merely a normalizing term, we obtain:

$$ f_{Q_k}(q_k \mid x) \sim Beta(\alpha + x,\ \beta + 1 - x) \tag{5.5} $$

As desired, the posterior distribution of $Q_k$ is also a Beta distribution, but with different parameters. As seen from Equation (5.5), a true message ($x = 1$) will increase the $\alpha$ parameter of the distribution by 1, while a false message will increase the $\beta$ parameter by 1.
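The conjugate update amounts to a pair of counters. The sketch below is illustrative only (the function names are hypothetical); it uses exact fractions so that the posterior means can be read off directly, and it contrasts a weak prior with a strong prior that share the same mean.

```python
from fractions import Fraction

def update_reliability(alpha, beta, verified_true):
    """Beta update: a verified true message adds 1 to alpha, a false one to beta."""
    return (alpha + 1, beta) if verified_true else (alpha, beta + 1)

def expected_reliability(alpha, beta):
    """Mean of Beta(alpha, beta), i.e. alpha / (alpha + beta)."""
    return Fraction(alpha) / (Fraction(alpha) + Fraction(beta))

# Both priors have mean 1/2, but they carry different weight.
weak = update_reliability(1, 1, verified_true=True)
strong = update_reliability(10, 10, verified_true=True)
weak_mean = expected_reliability(*weak)        # Fraction(2, 3)
strong_mean = expected_reliability(*strong)    # Fraction(11, 21)
```

A single verified message moves the weak prior's mean from 1/2 to 2/3, while the strong prior's mean barely moves, to 11/21.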

Let us limit the current discussion to a single informant, providing messages of a specific size $k$ and reliability $Q_k$. Although $q_k$ is now the value of a random variable, we can apply the same reasoning as in Chapter II in conjunction with the law of total expectation to generalize the calculation of the probability of receiving a specific message $M_k$ in Equation (2.11):

$$
P(M_k \mid T_0 = t) =
\begin{cases}
\displaystyle\int_0^1 \frac{q_k}{\binom{n-1}{k-1}}\, f_{Q_k}(q_k)\, dq_k = \frac{E[Q_k]}{\binom{n-1}{k-1}} & t \in M_k \\[3ex]
\displaystyle\int_0^1 \frac{1 - q_k}{\binom{n-1}{k}}\, f_{Q_k}(q_k)\, dq_k = \frac{1 - E[Q_k]}{\binom{n-1}{k}} & t \notin M_k
\end{cases}
\tag{5.6}
$$

As in Chapter II, we assume that all true messages of a certain size are equally likely, and all false ones are also equally likely. When a new piece of intelligence is received, the departure time distribution can be updated using the same method as in Chapter II, while replacing $q_k$ with $E[Q_k]$, which equals $\frac{\alpha}{\alpha + \beta}$ in the case of the Beta distribution. Since the departure time update is the same as in Chapter II, in this section we focus on the update of $Q_k$.

Let us discuss the values we should assign to the parameters $\alpha, \beta$ in the prior distribution. If we have no prior knowledge about the reliability of the informant, a common uninformative conjugate prior for $q_k$ is $Beta(\frac{1}{2}, \frac{1}{2})$. This is known as the Jeffreys prior and includes the minimal information regarding the distribution of $Q_k$ among all possible priors (Berger, 1993). However, if we do have a prior estimate of the reliability we can use it to set the prior. The mean reliability is given by $\frac{\alpha}{\alpha + \beta} = E[Q_k]$. The "weight" of the prior can be controlled by the values of $\alpha$ and $\beta$: the larger $\alpha$ and $\beta$, the less the posterior distribution will change with new evidence. Let us look at two cases: in the first case $\alpha = \beta = 1$ and in the second case $\alpha = \beta = 10$. In both cases $E[Q_k] = \frac{1}{2}$; however, after a single true message that has been verified, the estimated reliability of the informant in the first case would be $E[Q_k] = \frac{\alpha + 1}{\alpha + 1 + \beta} = \frac{2}{3}$, while in the second case $E[Q_k] = \frac{\alpha + 1}{\alpha + 1 + \beta} = \frac{11}{21}$.

We assume that the larger the message size $k$, the larger $E[Q_k]$ will be, since there are more possibilities for the message to contain the true departure time. We would also expect $E[Q_k] \ge \frac{k}{n}$ for a "useful" informant that provides correct messages with probability higher than that of messages generated uniformly at random.


In practice the verification can occur in many ways; for example, later intelligence may confirm without any doubt the location of the vessel, or the verification may come from the capture and interrogation of drug smugglers. Once the true departure time of the vessel is determined, the truthfulness of the messages received can be determined as well, and can be used to update the reliability of the informant, as described in this section. The updated reliability can then be used to better estimate the departure time of other vessels based on messages from the same informant.


B. UNVERIFIED MESSAGES 


In many situations, the operator cannot verify the informant's information. However, we still want to use the new information provided by the informant to update both the departure time distribution and the informant's reliability. For instance, if we receive a message that contradicts everything we know so far, it is more likely that the informant is incorrect, and therefore his reliability should be updated downwards. Unlike the update procedure in section A of this chapter, where we update the departure time distribution and the reliability sequentially, here we update the departure time distribution and the reliability parameter simultaneously.


1. The Bayesian Update Process 


The main idea behind the simultaneous update process is to define a joint probability distribution for the departure time $T_0$ and reliability $Q_k$, denoted by $f_{T_0, Q_k}(t, q_k)$. This is a mixed joint distribution, since the time parameter is a discrete random variable while the reliability parameter is continuous. As in section A of this chapter, the random reliability parameter $Q_k$ takes the value $q_k$.

The update of the joint distribution of the true departure time $T_0$ and reliability parameter $Q_k$ when a new message $M_k$ is received can be calculated by applying Bayes' theorem:

$$ f_{T_0, Q_k}(t, q_k \mid M_k) = f^{prior}_{T_0, Q_k}(t, q_k) \cdot \frac{P(M_k \mid T_0 = t, Q_k = q_k)}{P(M_k)} \tag{5.7} $$

In order to evaluate (5.7) we need to define the prior $f^{prior}_{T_0, Q_k}(t, q_k)$ and the update function $\frac{P(M_k \mid T_0 = t, Q_k = q_k)}{P(M_k)}$.

The prior $f^{prior}_{T_0, Q_k}(t, q_k)$ is combined from two parts: (1) the departure time, and (2) the reliability. We assume that these two components are independent for the prior. For the departure time, the prior can include any information; however, without further knowledge we assume it to be uniform among the departure times: $T_0 \sim U[T]$. For the reliability part we assume that a prior from the Beta family, as in section A, is appropriate, and so $Q_k \sim Beta(\alpha, \beta)$. The independence between $T_0$ and $Q_k$ holds only for the prior, but not for the updated distribution. The final expression of the joint prior distribution is:
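$$ f^{prior}_{T_0, Q_k}(t, q_k) = P(T_0 = t)\, f_{Q_k}(q_k) = \frac{1}{n} \cdot \frac{q_k^{\alpha - 1}\,(1 - q_k)^{\beta - 1}}{B(\alpha, \beta)} \tag{5.8} $$

Next let us derive the update function $\frac{P(M_k \mid T_0 = t, Q_k = q_k)}{P(M_k)}$. First we need to calculate $P(M_k \mid T_0 = t, Q_k = q_k)$. As in Chapter II, Equation (2.11), the probability of receiving a message $M_k$ given the true departure time $T_0$ and reliability parameter $Q_k$ is:

$$
P(M_k \mid T_0 = t, Q_k = q_k) =
\begin{cases}
\dfrac{q_k}{\binom{n-1}{k-1}} & t \in M_k \\[3ex]
\dfrac{1 - q_k}{\binom{n-1}{k}} & t \notin M_k
\end{cases}
\tag{5.9}
$$

By defining an indicator function $I(s \in M_k)$, which equals 1 if $s \in M_k$ and 0 if $s \notin M_k$, we can combine the two cases in (5.9):

$$ P(M_k \mid T_0 = t, Q_k = q_k) = \left( \frac{q_k}{\binom{n-1}{k-1}} \right)^{I(t \in M_k)} \left( \frac{1 - q_k}{\binom{n-1}{k}} \right)^{1 - I(t \in M_k)} \tag{5.10} $$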
Notice that there are two ways for the probability of receiving a certain message $M_k$ in (5.10) to be large. Either the informant is considered reliable (high $Q_k$) and the message includes the true departure time, or the message does not include the true departure time and the informant is considered unreliable (low $Q_k$).

The denominator of the update function, $P(M_k)$, is the normalizing term and can be calculated by summing over $T_0$ and integrating over $Q_k$:

$$ P(M_k) = \sum_{t \in T} \int_0^1 f^{prior}_{T_0, Q_k}(t, q_k) \cdot P(M_k \mid T_0 = t, Q_k = q_k)\, dq_k \tag{5.11} $$
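which in our case, for the prior in (5.8), translates to:

$$
\begin{aligned}
P(M_k) &= \sum_{t \in M_k} \int_0^1 \frac{1}{n} \frac{q_k^{\alpha - 1}(1 - q_k)^{\beta - 1}}{B(\alpha, \beta)} \cdot \frac{q_k}{\binom{n-1}{k-1}}\, dq_k + \sum_{t \notin M_k} \int_0^1 \frac{1}{n} \frac{q_k^{\alpha - 1}(1 - q_k)^{\beta - 1}}{B(\alpha, \beta)} \cdot \frac{1 - q_k}{\binom{n-1}{k}}\, dq_k \\
&= \frac{k}{n \binom{n-1}{k-1}} \cdot \frac{B(\alpha + 1, \beta)}{B(\alpha, \beta)} + \frac{n - k}{n \binom{n-1}{k}} \cdot \frac{B(\alpha, \beta + 1)}{B(\alpha, \beta)} = \frac{k!\,(n - k)!}{n!} = \frac{1}{\binom{n}{k}}
\end{aligned}
\tag{5.12}
$$

This simple expression for $P(M_k)$, the probability that the informant provides the message $M_k$ before the arrival of any new information, is not surprising given the uniform prior and uniform generation of messages. There are $\binom{n}{k}$ possible messages that can be generated by picking $k$ departure times from $n$ possibilities, and so the probability of picking one of them uniformly is $\frac{1}{\binom{n}{k}}$.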
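The normalizing term can be checked numerically by evaluating the two Beta-function ratios directly. The sketch below is an illustrative check rather than part of the derivation; the function names are hypothetical and the Beta function is computed from the log-gamma function.

```python
from math import comb, exp, lgamma

def beta_fn(a, b):
    """Beta function B(a, b) computed via log-gamma for numerical stability."""
    return exp(lgamma(a) + lgamma(b) - lgamma(a + b))

def message_probability(n, k, alpha, beta):
    """P(M_k) under a uniform time prior and a Beta(alpha, beta) reliability prior."""
    b0 = beta_fn(alpha, beta)
    # Contribution of the k departure times that are inside the message.
    true_part = (k / n) * beta_fn(alpha + 1, beta) / (comb(n - 1, k - 1) * b0)
    # Contribution of the n - k departure times that are outside the message.
    false_part = ((n - k) / n) * beta_fn(alpha, beta + 1) / (comb(n - 1, k) * b0)
    return true_part + false_part

p_message = message_probability(n=9, k=3, alpha=2.0, beta=5.0)
uniform_pick = 1 / comb(9, 3)
```

The result does not depend on the prior parameters, because the two Beta ratios always sum to one.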
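Now that we have both the prior and the update function we can evaluate the new joint distribution after receiving a message. The update function is:

$$\frac{P(M_k \mid T_d = t, Q_k = q_k)}{P(M_k)} = \binom{n}{k} \left[\frac{q_k}{\binom{n-1}{k-1}}\right]^{I(t \in M_k)} \left[\frac{1-q_k}{\binom{n-1}{k}}\right]^{1-I(t \in M_k)} = \left[\frac{n\,q_k}{k}\right]^{I(t \in M_k)} \left[\frac{n\,(1-q_k)}{n-k}\right]^{1-I(t \in M_k)} \tag{5.13}$$

Combining the prior, Equation (5.8), and the update function (5.13), the updated joint distribution is:

$$f_{T_d,Q_k}(t,q_k) = \left[\frac{1}{k}\right]^{I(t \in M_k)} \left[\frac{1}{n-k}\right]^{1-I(t \in M_k)} \frac{q_k^{\alpha-1+I(t \in M_k)}\,(1-q_k)^{\beta-1+\left(1-I(t \in M_k)\right)}}{B(\alpha,\beta)} \tag{5.14}$$
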
A convenient way to think about this joint distribution is by envisioning $n$ different reliability parameters $Q_{k,t}$, one for each possible departure time. Each of those variables has a Beta distribution with parameters $\alpha_t$ and $\beta_t$. To update the joint probability distribution we increase $\alpha_t$ by 1 for departure times that are included in the message and increase $\beta_t$ by 1 for departure times that are not included in the message. This increases the expected $Q_{k,t}$ for departure times that are included in messages, and decreases the expected $Q_{k,t}$ of departure times that are not included.
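
This bookkeeping view can be sketched as follows. The representation is an assumption made for illustration: each departure time $t$ carries a triple $(p_t, \alpha_t, \beta_t)$, where $p_t$ is the current marginal probability of $t$, so that the joint density is $p_t$ times a Beta$(\alpha_t, \beta_t)$ density. The function name `update`, the integer labels for departure times, and the sample parameters are likewise hypothetical.

```python
from math import comb


def update(state, message, n):
    """One Bayesian update of the joint distribution after a message.

    state maps each departure time t to (p_t, alpha_t, beta_t).
    """
    k = len(message)
    new_state = {}
    for t, (p, a, b) in state.items():
        if t in message:
            # Likelihood q / C(n-1, k-1); its mean under Beta(a, b) is a / (a + b).
            weight = p * (a / (a + b)) / comb(n - 1, k - 1)
            new_state[t] = (weight, a + 1, b)   # alpha_t grows by 1
        else:
            # Likelihood (1 - q) / C(n-1, k); its mean under Beta(a, b) is b / (a + b).
            weight = p * (b / (a + b)) / comb(n - 1, k)
            new_state[t] = (weight, a, b + 1)   # beta_t grows by 1
    total = sum(w for w, _, _ in new_state.values())
    return {t: (w / total, a, b) for t, (w, a, b) in new_state.items()}


# Illustrative values: n = 5 departure times, prior Beta(2, 2), message {1, 2}.
n, alpha, beta = 5, 2, 2
prior = {t: (1 / n, alpha, beta) for t in range(1, n + 1)}
posterior = update(prior, {1, 2}, n)
for t, (p, a, b) in sorted(posterior.items()):
    print(t, round(p, 4), a, b)
```

Because the state after an update has the same form as the state before it, the same function can be applied again when further messages arrive.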


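2. The Marginal Distributions after a Single Message

In order to gain more insight on the influence of a message on the distribution of $T_d$ and $Q_k$, we calculate the marginal distributions after a single message.

a. The Marginal Distribution of $T_d$

The general expression for calculating a marginal distribution of a joint distribution is $P(T_d = t) = \int_0^1 f_{T_d,Q_k}(t,q_k)\, dq_k$. Applying it to the joint distribution defined in (5.14) yields:

$$P(T_d = t) = \begin{cases} \dfrac{1}{k} \displaystyle\int_0^1 \frac{q_k^{\alpha}(1-q_k)^{\beta-1}}{B(\alpha,\beta)}\, dq_k = \dfrac{1}{k}\,\dfrac{B(\alpha+1,\beta)}{B(\alpha,\beta)} & t \in M_k \\[3ex] \dfrac{1}{n-k} \displaystyle\int_0^1 \frac{q_k^{\alpha-1}(1-q_k)^{\beta}}{B(\alpha,\beta)}\, dq_k = \dfrac{1}{n-k}\,\dfrac{B(\alpha,\beta+1)}{B(\alpha,\beta)} & t \notin M_k \end{cases} \tag{5.15}$$

But we can simplify this expression since the Beta function has the property $\dfrac{B(\alpha+1,\beta)}{B(\alpha,\beta)} = \dfrac{\alpha}{\alpha+\beta}$ (and likewise $\dfrac{B(\alpha,\beta+1)}{B(\alpha,\beta)} = \dfrac{\beta}{\alpha+\beta}$), so the marginal distribution of the departure time is simply:

$$P(T_d = t) = \begin{cases} \dfrac{1}{k} \cdot \dfrac{\alpha}{\alpha+\beta} & t \in M_k \\[2ex] \dfrac{1}{n-k} \cdot \dfrac{\beta}{\alpha+\beta} & t \notin M_k \end{cases} \tag{5.16}$$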


Just like in Chapter II, Equation (2.15), the probability of a certain departure time, given a message $M_k$, depends on the size of the message $k$ and the expected value of the reliability of the message, $\dfrac{\alpha}{\alpha+\beta} = E[Q_k]$. In order for the marginal distribution of the times that are in the message to increase, we require that $\dfrac{1}{k} \cdot \dfrac{\alpha}{\alpha+\beta} > \dfrac{1}{n-k} \cdot \dfrac{\beta}{\alpha+\beta}$, or $\dfrac{\alpha}{\alpha+\beta} > \dfrac{k}{n}$. We can interpret this condition as stating the informant must be “useful,” that is, one that has a higher probability to generate a true message than if one were to pick a message uniformly at random. The probability that the message contains the true value is equal to the expected reliability:

$$P(T_d \in M_k) = \sum_{t \in M_k} P(T_d = t) = k \cdot \frac{1}{k} \cdot \frac{\alpha}{\alpha+\beta} = \frac{\alpha}{\alpha+\beta} \tag{5.17}$$

which is exactly the desired quality of the reliability notion.
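
The marginal in (5.16) and the “useful informant” condition are simple enough to sketch directly. The function names `departure_marginal` and `is_useful` and the two parameter pairs in the loop are illustrative assumptions.

```python
def departure_marginal(n, k, alpha, beta):
    """Return (P(T_d = t) for t in M_k, P(T_d = t) for t not in M_k), per (5.16)."""
    reliability = alpha / (alpha + beta)       # E[Q_k]
    p_in = reliability / k                     # each departure time in the message
    p_out = (1.0 - reliability) / (n - k)      # each departure time outside it
    return p_in, p_out


def is_useful(n, k, alpha, beta):
    """True when the informant beats picking a message uniformly at random."""
    return alpha / (alpha + beta) > k / n


n, k = 4, 2
for alpha, beta in [(9, 1), (1, 3)]:
    p_in, p_out = departure_marginal(n, k, alpha, beta)
    # For a useful informant, times in the message rise above the prior 1/n.
    print(alpha, beta, is_useful(n, k, alpha, beta), p_in > 1 / n)
```

For a useful informant the probability of each named departure time rises above the uniform prior $1/n$; for an informant who is not useful it falls below it.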
b. The Marginal Distribution of $Q_k$

The general formula for the marginal distribution over a discrete random variable is $f_{Q_k}(q_k) = \sum_{t \in T} f_{T_d,Q_k}(t,q_k)$, which in our case translates to:

$$f_{Q_k}(q_k) = k \cdot \frac{1}{k} \cdot \frac{q_k^{\alpha}(1-q_k)^{\beta-1}}{B(\alpha,\beta)} + (n-k) \cdot \frac{1}{n-k} \cdot \frac{q_k^{\alpha-1}(1-q_k)^{\beta}}{B(\alpha,\beta)} = \frac{q_k^{\alpha-1}(1-q_k)^{\beta-1}}{B(\alpha,\beta)} \tag{5.18}$$

The marginal distribution of $Q_k$ after a single message remains the same. This result is intuitive. Because the prior distribution of the departure time is uniform, we cannot utilize that distribution to gain information about whether the first message is true or false (in other words, all the possible messages are equally likely to be received) and therefore the estimated reliability remains unchanged. We cannot draw any information about the informant from his message because we did not have any specific information about the departure times. If the prior distribution for the departure time were not uniform, the marginal distribution of $Q_k$ would change.

The mean of the reliability is the same as before this message:

$$E[Q_k] = \int_0^1 f_{Q_k}(q_k)\, q_k\, dq_k = \frac{B(\alpha+1,\beta)}{B(\alpha,\beta)} = \frac{\alpha}{\alpha+\beta} \tag{5.19}$$


In order to make this model clearer, let us look at a numeric example, following the example in Chapter II, section D.4. We will assume that there are only four possible departure times, $T = \{t_1, t_2, t_3, t_4\}$, one of which is the true departure time. We receive a message $M_2$ of size $k = 2$, and the reliability of our informant for messages of size 2 is $Q_2 \sim \text{Beta}(\alpha = 9, \beta = 1)$ (so that $E[Q_2] = 0.9$). The prior probability distribution is assumed to be uniform: $f(T_d = t_1) = f(T_d = t_2) = f(T_d = t_3) = f(T_d = t_4) = 0.25$.


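The joint distribution, as developed in Equation (5.14), in this example is calculated to be:

$$f_{T_d,Q_2}(t,q_2) = \begin{cases} \dfrac{1}{2} \cdot \dfrac{q_2^{9}(1-q_2)^{0}}{B(9,1)} & t \in M_2 \\[2ex] \dfrac{1}{2} \cdot \dfrac{q_2^{8}(1-q_2)^{1}}{B(9,1)} & t \notin M_2 \end{cases} \tag{5.20}$$

The updated marginal distribution will be:

$$\begin{aligned} f(T_d = t) &= \int_0^1 \frac{1}{2} \cdot \frac{q_2^{9}(1-q_2)^{0}}{B(9,1)}\, dq_2 = \frac{1}{2} \cdot \frac{9}{10} = 0.45, & t \in M_2 \\ f(T_d = t) &= \int_0^1 \frac{1}{2} \cdot \frac{q_2^{8}(1-q_2)^{1}}{B(9,1)}\, dq_2 = \frac{1}{2} \cdot \frac{1}{10} = 0.05, & t \notin M_2 \end{aligned} \tag{5.21}$$

which is in full agreement with the results in (2.16).

The marginal distribution of $Q_2$ is now:

$$f_{Q_2}(q_2) = \frac{q_2^{9}(1-q_2)^{0}}{B(9,1)} + \frac{q_2^{8}(1-q_2)^{1}}{B(9,1)} = \frac{q_2^{8}(1-q_2)^{0}}{B(9,1)} \tag{5.22}$$

which means that $Q_2 \sim \text{Beta}(9,1)$, as before the message.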
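
The values in (5.21) and (5.22) can also be checked by integrating the joint density numerically. The sketch below uses a plain midpoint rule; the helper names (`beta_fn`, `joint_density`, `integrate`) and the number of integration steps are assumptions made for illustration.

```python
from math import exp, lgamma


def beta_fn(a, b):
    """Beta function B(a, b) via log-gamma."""
    return exp(lgamma(a) + lgamma(b) - lgamma(a + b))


def joint_density(in_message, q, alpha=9.0, beta=1.0, n=4, k=2):
    """Posterior joint density (5.14) for one departure time, at reliability q."""
    if in_message:
        return q ** alpha * (1 - q) ** (beta - 1) / (k * beta_fn(alpha, beta))
    return q ** (alpha - 1) * (1 - q) ** beta / ((n - k) * beta_fn(alpha, beta))


def integrate(f, steps=20000):
    """Midpoint rule on [0, 1]."""
    h = 1.0 / steps
    return h * sum(f((i + 0.5) * h) for i in range(steps))


n, k = 4, 2
p_in = integrate(lambda q: joint_density(True, q))      # a time named in M_2
p_out = integrate(lambda q: joint_density(False, q))    # a time not named in M_2
# Marginal density of Q_2: k times inside the message plus n - k times outside.
mean_q = integrate(
    lambda q: q * (k * joint_density(True, q) + (n - k) * joint_density(False, q))
)
print(round(p_in, 4), round(p_out, 4), round(mean_q, 4))
```

Up to integration error, the three numbers should match 0.45, 0.05 and the unchanged mean reliability 0.9.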


In the following section, we will discuss an example where the marginal distribution of $Q_k$ does change after the update.


3. The Joint Distribution after Two Identical Messages 


We will now examine how the joint distribution evolves after two identical messages. By applying the update process described in (5.7) twice, with the new prior being the posterior distribution after the first update (5.14), we obtain the expression for the distribution after two identical messages $M_k$, with $C$ as the normalizing constant:

$$f_{T_d,Q_k}(t,q_k) = C \left[\frac{q_k}{\binom{n-1}{k-1}}\right]^{I(t \in M_k)} \left[\frac{1-q_k}{\binom{n-1}{k}}\right]^{1-I(t \in M_k)} \left[\frac{1}{k}\right]^{I(t \in M_k)} \left[\frac{1}{n-k}\right]^{1-I(t \in M_k)} \frac{q_k^{\alpha-1+I(t \in M_k)}\,(1-q_k)^{\beta-1+\left(1-I(t \in M_k)\right)}}{B(\alpha,\beta)} \tag{5.23}$$


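By inserting all the constants into the normalizing parameter and renaming it, we obtain:

$$f_{T_d,Q_k}(t,q_k) = C' \left[\frac{1}{k}\right]^{2 I(t \in M_k)} \left[\frac{1}{n-k}\right]^{2\left(1-I(t \in M_k)\right)} q_k^{\alpha-1+2 I(t \in M_k)}\,(1-q_k)^{\beta-1+2\left(1-I(t \in M_k)\right)} \tag{5.24}$$

We calculate the normalizing term $C'$ by requiring the joint density to integrate to 1:

$$\sum_{t \in T} \int_0^1 f_{T_d,Q_k}(t,q_k)\, dq_k = 1 \;\Rightarrow\; C' \left[\frac{B(\alpha+2,\beta)}{k} + \frac{B(\alpha,\beta+2)}{n-k}\right] = 1 \;\Rightarrow\; C' = \frac{1}{\dfrac{B(\alpha+2,\beta)}{k} + \dfrac{B(\alpha,\beta+2)}{n-k}} \tag{5.25}$$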


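And now we can calculate the updated distribution in (5.24). Unlike in section 2.b of this chapter, the marginal distribution of $Q_k$ now changes after the second message. We compute $f_{Q_k}(q_k) = \sum_{t \in T} f_{T_d,Q_k}(t,q_k)$ using the new distribution after two messages obtained in (5.24):

$$f_{Q_k}(q_k) = C' \left[\frac{q_k^{\alpha+1}(1-q_k)^{\beta-1}}{k} + \frac{q_k^{\alpha-1}(1-q_k)^{\beta+1}}{n-k}\right] \tag{5.26}$$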


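We can see that the marginal distribution of $Q_k$, which was a single Beta distribution before the update, became a mixture of two Beta distributions after the update. The mean of $Q_k$ after two messages is:

$$\begin{aligned} E[Q_k] &= C' \int_0^1 \left[\frac{q_k^{\alpha+1}(1-q_k)^{\beta-1}}{k} + \frac{q_k^{\alpha-1}(1-q_k)^{\beta+1}}{n-k}\right] q_k\, dq_k \\ &= \frac{\alpha}{\alpha+\beta+2} + \frac{2}{\alpha+\beta+2} \cdot \frac{\dfrac{\alpha(\alpha+1)}{k}}{\dfrac{\alpha(\alpha+1)}{k} + \dfrac{\beta(\beta+1)}{n-k}} \end{aligned} \tag{5.27}$$

For the case in which the parameters satisfy $\dfrac{\alpha+1}{\alpha+\beta+2} > \dfrac{k}{n}$, the mean of the updated marginal distribution is bigger than before the update:

$$\begin{aligned} E[Q_k] &= \frac{\alpha}{\alpha+\beta+2} + \frac{2}{\alpha+\beta+2} \cdot \frac{\dfrac{\alpha(\alpha+1)}{k}}{\dfrac{\alpha(\alpha+1)}{k} + \dfrac{\beta(\beta+1)}{n-k}} \\ &> \frac{\alpha}{\alpha+\beta+2} + \frac{2}{\alpha+\beta+2} \cdot \frac{\alpha}{\alpha+\beta} = \frac{\alpha^2 + \alpha\beta + 2\alpha}{(\alpha+\beta)(\alpha+\beta+2)} = \frac{\alpha}{\alpha+\beta} \end{aligned} \tag{5.28}$$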


This is reminiscent of the requirement $\dfrac{\alpha}{\alpha+\beta} > \dfrac{k}{n}$ for the informant to be considered useful, from section 2.a. Note that in cases when the informant is not “useful” (meaning his mean reliability is lower than what it would be if he had picked departure times randomly), the updated $E[Q_k]$ might decrease. This occurs because if we believe the informant is very unreliable then we believe that the departure times that he stated in the message are less likely, and in this light, we downgrade his reliability even further.
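
The closed form for the mean after two identical messages, (5.27), together with the threshold $\frac{\alpha+1}{\alpha+\beta+2} > \frac{k}{n}$, can be sketched as follows. The function name `mean_after_two` and the three parameter sets are illustrative assumptions.

```python
def mean_after_two(n, k, alpha, beta):
    """E[Q_k] after two identical messages of size k, closed form of (5.27)."""
    in_term = alpha * (alpha + 1) / k            # weight of the Beta(alpha+2, beta) part
    out_term = beta * (beta + 1) / (n - k)       # weight of the Beta(alpha, beta+2) part
    s = alpha + beta + 2
    return alpha / s + (2 / s) * in_term / (in_term + out_term)


n, k = 2, 1
for alpha, beta in [(3, 2), (2, 3), (2, 2)]:
    before = alpha / (alpha + beta)
    after = mean_after_two(n, k, alpha, beta)
    above_threshold = (alpha + 1) / (alpha + beta + 2) > k / n
    print(alpha, beta, round(before, 4), round(after, 4), above_threshold)
```

The first pair lies above the threshold and the mean rises, the second lies below it and the mean falls, and the third sits exactly on the threshold, where the mean stays put.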
4. Visualization of the Update Process

We develop a small simulation that visualizes the joint distribution update process. In the following example $n = 9$, $k = 3$ and the initial $Q_k$ parameter values are $\alpha = 4$, $\beta = 1$, leading to $E[Q_k] = 0.8$.

The joint distribution before any messages arrive is depicted in Figure 13:


[Figure: plot titled “Joint distribution of correctness and departure time”; horizontal axis: correctness $q$ (0 to 1); vertical axis: departure time.]

Figure 13. Joint prior distribution.


As we can see the variables are independent. The departure time is uniform, and the higher values of $Q_k$ are more likely. Now let us examine what happens after receiving a single message $M_3$ of three departure times. The joint distribution changes to:
[Figure: plot titled “Joint distribution of correctness and departure time”; horizontal axis: correctness $q$ (0 to 1); vertical axis: departure time.]

Figure 14. Joint distribution after a single message.


This implies that either the true departure time is one of the three times in the message and the reliability of the informant is rather high, or the true departure time is not one of the above and the informant's reliability is low. If we receive another message, the distribution update can be seen in Figure 15:
[Figure: plot titled “Joint distribution of correctness and departure time”; horizontal axis: correctness $q$ (0 to 1); vertical axis: departure time.]

Figure 15. Joint distribution after two messages.


It is very likely that the true departure time is $T_9$, with a very reliable informant. However, there is still some probability that the true departure time is not $T_9$ and the reliability is lower.
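
A simulation of this kind can be sketched by evaluating the joint density on a grid. The sketch below is not the code used to produce the figures: the two messages `{1, 5, 9}` and `{3, 6, 9}` are hypothetical, chosen only so that $T_9$ is the single departure time common to both, and the code assumes all messages have the same size $k$ and are conditionally independent given $T_d$ and $Q_k$. The function names are likewise assumptions.

```python
from math import comb, exp, lgamma


def beta_fn(a, b):
    """Beta function B(a, b) via log-gamma."""
    return exp(lgamma(a) + lgamma(b) - lgamma(a + b))


def posterior(messages, n, k, alpha, beta):
    """Return (marginal of T_d, joint density function) after the given messages."""
    c_in, c_out = comb(n - 1, k - 1), comb(n - 1, k)
    params, weights = {}, {}
    for t in range(1, n + 1):
        hits = sum(t in m for m in messages)     # messages that name t
        misses = len(messages) - hits            # messages that omit t
        a, b = alpha + hits, beta + misses
        params[t] = (a, b)
        weights[t] = beta_fn(a, b) / (c_in ** hits * c_out ** misses)
    z = sum(weights.values())
    marginal = {t: w / z for t, w in weights.items()}

    def density(t, q):
        a, b = params[t]
        return marginal[t] * q ** (a - 1) * (1 - q) ** (b - 1) / beta_fn(a, b)

    return marginal, density


# Hypothetical messages for illustration only.
n, k, alpha, beta = 9, 3, 4, 1
marginal, density = posterior([{1, 5, 9}, {3, 6, 9}], n, k, alpha, beta)

# Grid of joint density values: one row per departure time, ten columns over q.
grid = [[density(t, (j + 0.5) / 10) for j in range(10)] for t in range(1, n + 1)]
most_likely = max(marginal, key=marginal.get)
print(most_likely, round(marginal[most_likely], 4))
```

The rows of `grid` are the values a heat map of the joint distribution would display. With these hypothetical messages most of the probability mass concentrates on the departure time named in both messages, at high values of $q$.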


5. Change of $T_d$ and $Q_k$ with Number of Messages

It is also interesting to examine the distribution of $T_d$ and the mean of $Q_k$ after multiple messages. In the following example, $n = 2$, $k = 1$ and the initial $Q_k$ parameter values are $\alpha = 3$, $\beta = 2$, leading to $E[Q_k] = 0.6$.

Let us assume we receive 10 identical messages: $M_1 = \{T_1\}$. The distribution of $T_d$ as a function of the number of messages is plotted in Figure 16:
[Figure: plot titled “Marginal Distribution of Td”; horizontal axis: number of messages (0 to 10).]

Figure 16. Marginal distribution of $T_d$.


As expected, the probability of $T_1$ increases with the number of messages, while the probability of $T_2$ decreases.

Figure 17 shows how the mean of $Q_k$ changes with the number of messages received:
[Figure: plot titled “E[Qk]”; horizontal axis: number of messages (1 to 10); vertical axis: $E[Q_k]$.]

Figure 17. Mean of $Q_k$.


As more messages arrive, the mean of $Q_k$ increases. The repeated messages confirm both the departure time (as shown in Figure 16) and the informant’s reliability. Note that the first message does not change the mean of $Q_k$.

Now to a slightly more interesting example: let us assume that $n = 2$, $k = 1$ and the initial $Q_k$ parameter values are $\alpha = 2$, $\beta = 3$, leading to $E[Q_k] = 0.4$. The mean of $Q_k$ is now lower than the uniform probability of picking $T_1$ at random, and therefore the informant is more likely to provide incorrect information. Again, we assume we receive 10 identical messages: $M_1 = \{T_1\}$.

The marginal distributions in this case are:
[Figure: plot titled “Marginal Distribution of Td”; horizontal axis: number of messages (0 to 10).]

Figure 18. Marginal distribution of $T_d$.


Because the initial reliability is so low, we believe the informant is misleading us and thus the probability of $T_1$ decreases and the probability of $T_2$ increases.


[Figure: plot titled “E[Qk]”; horizontal axis: number of messages (0 to 10); vertical axis: $E[Q_k]$.]

Figure 19. Mean of $Q_k$.


The mean of $Q_k$ is decreasing with the number of messages. This result is also quite intuitive: since the probability of $T_1$ is decreasing, the informant’s repeated messages that include $T_1$ are considered less true and the estimated reliability of the informant is downgraded.
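
Both repeated-message examples can be reproduced with a short sequential update. This is a sketch under the chapter's assumptions (identical messages, each treated as a fresh Bayesian update); the function name `track_identical_messages` and its return format are hypothetical.

```python
from math import comb


def track_identical_messages(n, k, alpha, beta, num_messages):
    """Track P(T_d in M_k) and E[Q_k] after 0, 1, ..., num_messages identical messages."""
    c_in, c_out = comb(n - 1, k - 1), comb(n - 1, k)
    p_in = k / n                      # prior mass on the times named in the message
    prob_track, mean_track = [], []
    for j in range(num_messages + 1):
        # After j messages: Beta(alpha + j, beta) on the named times,
        # Beta(alpha, beta + j) on the others; the marginal of Q_k is their mixture.
        s = alpha + beta + j
        prob_track.append(p_in)
        mean_track.append(p_in * (alpha + j) / s + (1 - p_in) * alpha / s)
        # Fold in the next identical message.
        w_in = p_in * ((alpha + j) / s) / c_in
        w_out = (1 - p_in) * ((beta + j) / s) / c_out
        p_in = w_in / (w_in + w_out)
    return prob_track, mean_track


# First example: alpha = 3, beta = 2 (mean reliability 0.6).
prob_a, mean_a = track_identical_messages(2, 1, 3, 2, 10)
# Second example: alpha = 2, beta = 3 (mean reliability 0.4).
prob_b, mean_b = track_identical_messages(2, 1, 2, 3, 10)
print([round(p, 3) for p in prob_a])
print([round(m, 3) for m in mean_a])
print([round(p, 3) for p in prob_b])
print([round(m, 3) for m in mean_b])
```

In the first example both the probability of $T_1$ and the mean of $Q_k$ grow with the number of messages; in the second both shrink. In both cases the first message leaves the mean of $Q_k$ unchanged.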


C. DISCUSSION 


We have proposed a scheme for simultaneous update of the reliability and the estimation of the departure time. This scheme includes a mixed joint distribution of a discrete ($T_d$) and a continuous ($Q_k$) random variable that updates in a Bayesian fashion. As we saw in Chapter III, even if the assumptions are not fulfilled, the performance of the Bayesian method is satisfactory.


The scheme proposed makes use of conjugate Beta distributions, ensuring simple calculations that are easy to implement while maintaining flexibility to specify the mean reliability of an informant and the “strength” of our estimation regarding this reliability.

Although the reliability is unknown, and the true departure time is also unknown, we can still estimate both of them and improve our estimate as we receive more intelligence. The scheme makes full use of the information provided by the informant to efficiently update the informant’s reliability and the vessel’s location simultaneously.
VI. SUMMARY, CONCLUSIONS, AND FUTURE WORK


A. DATA FUSION 


In this thesis we formulate a model to assist the Joint Interagency Task Force 
South in its efforts to fight drug traffickers originating from South America. The main 
problem addressed in this thesis is how to combine different sources of intelligence into a 
coherent picture to effectively estimate the location of drug smugglers. In the initial 
model, we focus on determining the departure time of a smuggler. In later chapters we 
develop methods to estimate the route the smuggler travels, the vessel type, and velocity. 
The main contribution of this thesis is developing models to fuse information from two 
different types of intelligence sources, namely sensor-based sources and human-based 
sources, into a coherent intelligence picture. We update this picture as new information 


arrives. 


The main model we explore is the Bayesian model, which is quite intuitive, 
mathematically rigorous and elegant. However, this method requires assumptions 
regarding the underlying probability distributions related to the intelligence gathered. 
Those assumptions are usually difficult to justify in practice since their validation 


requires gathering large amounts of data. 


We compare the Bayesian model to a different type of intelligence updating 
mechanism, the Dempster-Shafer method. We examine several ways to implement 
Dempster-Shafer theory and compare those methods to Bayes’ theory both qualitatively 
and quantitatively. The quantitative comparison is done using a simulation across 
multiple possible values of an informant’s reliability and ways in which the messages are 


created. 


We found that even when the assumptions of the Bayes’ update process are 
violated, it still manages to yield the best results in the scenarios examined. It specifies 
the correct departure time a larger fraction of the time than the other methods. All the 
updating methods perform poorly when the reliability of the informant is low or is 


mistaken to be low, and there is non-uniformity in the way he produces messages. 
B. UPDATING THE INFORMANT'S RELIABILITY


A major contribution of this thesis is a Bayesian model that allows the operator to assess the reliability of the informant and update the vessel’s location simultaneously. We can do this when the informant’s messages can be verified and when they cannot. The informant’s reliability-departure time joint distribution model, described in detail in Chapter V, allows estimating both the location of the vessel and the reliability of the informant together and updating the estimate as more intelligence is received, even though neither the true departure time of the vessel nor the reliability of the informant is known initially.


C. FUTURE WORK 


This thesis suggests multiple models for updating the operator’s perception as he 
receives more intelligence and sets a framework for the comparison and evaluation of 
those data fusion models. However, the research on the models developed can be 


extended in the following ways: 


1. Extending the Model


In Chapter IV, multiple extensions to the basic model were suggested, but in order to encompass real-life situations, one may extend the model even further. Possible extensions of interest are:

• Accounting for variable velocities of the vessels.

• Accounting for the case where the number of vessels in the theater of operations changes over time.

• Evaluating the probability that the informant delivers a message of a certain size $k$. In our analysis we assumed that this probability is known, but in fact it can be evaluated as more information arrives, in a similar fashion to the one used in Chapter V to estimate the reliability $q_k$.


2. More Extensive Comparison


Although we have shown that the Bayes’ model performs best in the scenarios examined, the intelligence community may still benefit from a more exhaustive comparison between the models.

A worthwhile direction to improve the comparison conducted in this thesis is examining the models with other streams of messages, created in different ways than described in Chapter III. Examining the probability of specifying the correct departure time after receiving messages of different sizes from multiple informants with different reliabilities may also be of interest.

Lastly, the computational complexity of the update methods could be compared directly by measuring the time required to perform the computations of each method.


3. Real Data


Feeding the models with real data may immensely increase the insights we can gain from the models and allow us to compare them more effectively. Such real data may relate to 1) prior knowledge about the vessels’ departure times, velocities and routes, 2) the characteristics of sensors, namely the false positive and false negative error probabilities, and 3) the characteristics of the informants, such as their reliability and, most interestingly, the way in which they produce their messages and what types of mistakes they tend to make.